Abstract

Given a random sample from a parametric model, we show how indirect inference estimators
based on appropriate nonparametric density estimators (i.e., simulation-based minimum
distance estimators) can be constructed that, under mild assumptions, are asymptotically
normal with variance-covariance matrix equal to the Cramér-Rao bound.

1 Introduction 

Suppose we observe a random sample $X_1, \ldots, X_n$ from a distribution $P$, and we are in the
classical situation where one maintains a parametric model $\mathcal{M} = \{P(\theta) : \theta \in \Theta\}$
of probability measures $P(\theta)$, indexed by the set $\Theta \subseteq \mathbb{R}^b$, for statistical
inference. Under the assumption of correct specification of the parametric model, i.e.,
$P = P(\theta_0)$ for a (unique) $\theta_0 \in \Theta$, the maximum likelihood estimator (MLE) is often a
natural estimator of $\theta_0$ (as well as of $P(\theta_0)$), since it is asymptotically efficient under
well-known regularity conditions.

There are several reasons, however, why maximum likelihood might nevertheless not be the 
method of choice, and alternatives, that ideally are also asymptotically efficient, are of interest. 

A first such reason is rather classical (e.g., Huber (1972), Beran (1977), Millar (1981), Donoho
and Liu (1988), Lindsay (1994)) and comes from robustness considerations: A good estimator for
$\theta_0$ should be robust against misspecifications of $\mathcal{M}$. A lesson from the above-mentioned
literature is the following: If one wants an estimator of $\theta_0$ that is robust against
perturbations of $P(\theta_0)$ in some metric $\chi$, then one should rather use 'minimum distance
estimators' of the following form: if $P_n$ is a suitable (typically nonparametric)
$\chi$-consistent estimator of $P$, estimate $\theta$ by the minimizer over $\Theta$ of

$$Q_n(\theta) := \chi(P_n, P(\theta)). \tag{1}$$

Under several assumptions, Beran (1977) showed the interesting result that, if $\chi$ is the
Hellinger distance, and if $P_n$ is some kernel density estimator, such minimum distance
estimators are not only robust, but actually simultaneously asymptotically efficient, so that
they outperform the MLE in this sense. We will discuss the asymptotic efficiency aspect of his
result in more detail below.

A second, more practical reason against the use of the MLE that has arisen in recent
applications in econometrics and biostatistics is related to the fact that in these applications
analytic expressions for the densities in the parametric model, and hence for the likelihood
function, are not available (or intractable for numerical purposes). For example, the data may
be modeled by an equation of the form $X_i = g(\varepsilon_i, \theta_0)$, but the implied parametric
density may not be analytically tractable, e.g., because $g$ is complicated or $\varepsilon_i$ is
high-dimensional. The same problem occurs naturally also in estimation of dynamic nonlinear
models including stochastic differential equations; we refer to Smith (1993), Gouriéroux,
Monfort and Renault (1993), Gallant and Tauchen (1996), Gallant and Long (1997) and the
monograph Gouriéroux and Monfort (1996) for several concrete examples. This problem has led to
a growing literature about so-called indirect inference methods, where estimators other than
the MLE are suggested, often based on simulations; see the just mentioned references and Jiang
and Turnbull (2004). From a conceptual point of view, the main idea behind the indirect
inference approach can be phrased as follows:

1. Simulate a sample $X_1(\theta), \ldots, X_k(\theta)$ of size $k$ from the distribution $P(\theta)$ for
$\theta \in \Theta$ (which is often possible in the examples alluded to above, e.g., by using the
equations defining the model; see also Remark 1).

2. Based on the simulated sample as well as on the true data, compute estimators $P_k(\theta)$ and
$P_n$ in a not necessarily correctly specified but numerically tractable auxiliary model
$\mathcal{M}^{aux}$. [For example, by maximum likelihood if $\mathcal{M}^{aux}$ is finite-dimensional.]

3. Choose a suitable metric $\chi$ on $\mathcal{M}^{aux}$ and estimate $\theta_0$ by minimizing over $\theta$
the objective function

$$Q_{n,k}(\theta) := \chi(P_n, P_k(\theta)). \tag{2}$$
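
The three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure studied in this paper: the structural equation `g`, the one-parameter Gaussian auxiliary model (whose MLE is the sample mean), the absolute difference used as metric, and the grid search are all hypothetical choices made purely for illustration.

```python
import random
import statistics

def g(eps, theta):
    """Hypothetical structural equation X = g(eps, theta): a simple location shift."""
    return theta + eps

def auxiliary_estimate(sample):
    """Hypothetical auxiliary model N(mu, 1); its MLE of mu is the sample mean."""
    return statistics.fmean(sample)

def indirect_inference(data, k, grid, seed=0):
    rng = random.Random(seed)
    # Step 1: draw the simulation noise once; it is reused for every theta.
    noise = [rng.gauss(0.0, 1.0) for _ in range(k)]
    # Step 2: auxiliary estimate computed from the observed data.
    beta_data = auxiliary_estimate(data)

    # Step 3: distance between the data-based and the simulation-based estimate.
    def objective(theta):
        simulated = [g(e, theta) for e in noise]
        return abs(beta_data - auxiliary_estimate(simulated))

    return min(grid, key=objective)

rng = random.Random(1)
theta_true = 0.7
data = [g(rng.gauss(0.0, 1.0), theta_true) for _ in range(2000)]
grid = [i / 100 for i in range(-200, 201)]
theta_hat = indirect_inference(data, k=4000, grid=grid)
```

Because the same simulated noise is reused for every candidate value, the objective is a deterministic function of $\theta$ once the noise has been drawn.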

In most of the indirect inference literature, the auxiliary model $\mathcal{M}^{aux}$ is also
finite-dimensional (so that one in fact estimates a finite-dimensional parameter in Step 2
rather than the probability measure directly), and the resulting procedure can be shown to be
consistent and asymptotically normal (under standard regularity conditions, see Gouriéroux and
Monfort (1996)). However, the procedure is asymptotically efficient only if $\mathcal{M}^{aux}$
happens to be correctly specified. This assumption is certainly restrictive and often unnatural
if $\mathcal{M}^{aux}$ is of fixed finite dimension. Therefore Gallant and Long (1997) suggested that
choosing $\mathcal{M}^{aux}$ with dimension increasing in sample size should result in estimators
that are asymptotically efficient, the idea being that this essentially amounts to choosing an
infinite-dimensional auxiliary model for which the assumption of correct specification is
much less restrictive.

In the present paper we show in some generality that indirect inference estimators based on
suitable nonparametric estimators $P_n$ and $P_k(\theta)$ with common choices for the tuning
parameters ('sieve' dimensions), including rate-optimal choices, are asymptotically efficient in
the sense that they are asymptotically normal with asymptotic variance equal to the Cramér-Rao
bound. To the best of our knowledge, no proof of this fact was known before, although there are
some related results that need mentioning. We comment on the literature in some detail below,
but first wish to discuss the main ideas behind our results. [Robustness issues,
misspecification of $\mathcal{M}$, as well as uniformity in the asymptotic normality result are not
treated explicitly in this paper; for the latter two issues in a related context see Gach
(2010).]

From the discussion so far it transpires that indirect inference estimators from (2) are
minimum distance estimators, with the important (and nontrivial) modification that $P(\theta)$ in
(1) is replaced by an estimator based on simulations from $P(\theta)$. It is therefore of interest
to first briefly revisit Beran's (1977) asymptotic efficiency result: For simplicity, consider
the Fisher metric $\chi_F(f,g)^2 := \int (f-g)^2 p_0^{-1}$, where $p_0$ is the density of $P$, instead
of the Hellinger distance. [Note that the Fisher metric is closely related to the Hellinger
distance when $f$ and $g$ are near $p_0$.] If $\hat\theta_n$ is the minimizer of $Q_n$ in (1), then,
after a suitable Taylor expansion, asymptotic efficiency of $\sqrt{n}(\hat\theta_n - \theta_0)$
essentially reduces to proving two separate results: The first is to prove asymptotic normality
for the gradient of (1) at $\theta_0$, namely

$$\sqrt{n} \int s(\theta_0)\, d(P_n - P(\theta_0)), \tag{3}$$

where the 'influence function' $s(\theta_0)$ equals $\nabla_\theta p(\theta_0)\, p_0^{-1}$. Note that
$s(\theta_0)$ coincides with the efficient influence function in this problem, showing that
$\chi = \chi_F$ is a natural choice. The second step is to control the remainder term in the Taylor
expansion, which essentially requires convergence of $P_n$ to $P = P(\theta_0)$ (in the sense of
$L^p$-convergence of the respective densities for certain values of $p$). Beran (1977) implicitly
proved these two results under relatively restrictive conditions if $P_n$ is a kernel density
estimator with certain bandwidths, and if $\chi$ is the Hellinger metric. It is typically not
sensible (and for the most interesting metrics $\chi$ in fact not possible) to take $P_n$ to be the
empirical measure itself, but rather $P_n$ should be some smoothed version of it. In this case,
one cannot directly apply a standard central limit theorem to (3). However, recent results in
empirical process theory (Nickl (2007), Giné and Nickl (2008, 2009b)) establish exactly such
limit theorems for various density estimators. Furthermore, these limit theorems also hold for
density estimators that simultaneously deliver optimal convergence rates in $L^p$-type loss
functions, which is potentially relevant for good control of the remainder term. (We should note
that this simultaneous optimality property is related to what Bickel and Ritov (2003) label the
'plug-in property' of the density estimator $P_n$, cf. also Section 3 in Nickl (2007) for more
discussion.) Using similar methods we first prove a Beran-type result (Theorem [5]), under quite
weak (if not sharp) conditions, for the case where $\chi = \chi_F$ (but with the unknown $p_0$
replaced by an estimator), and where the underlying nonparametric estimator is based on an
$L^2$-projection of the empirical measure onto spaces of piecewise polynomials spanned by dyadic
B-splines.

Once asymptotic normality of the minimum distance estimator in (1) is established, the
question arises how the simulation step in (2) should be approached. Here two proof strategies
arise:

1. The first method is to show that the objective function $Q_{n,k}$ with simulations is
stochastically close, uniformly over $\Theta$, to the objective function $Q_n$ where no simulation
is performed. If

$$\sup_{\theta \in \Theta} |Q_{n,k}(\theta) - Q_n(\theta)| \tag{4}$$

has a sufficiently fast rate of convergence to zero (in probability), then it is not difficult to
show, using a result from Gach (2010), that the asymptotic distribution of the simulated
indirect inference estimator obtained from minimizing (2) is the same as the one of the
classical minimum distance estimator discussed in the previous paragraph. It turns out
that proving that the expression in (4) has a sufficiently fast rate of convergence to zero
can be done by deriving sharp bounds for the stochastic processes

$$f \mapsto \int f\, d(P_k(\theta) - P(\theta)), \qquad f \in \mathcal{F},$$

where $\mathcal{F}$ is a relevant class of functions, and again we can apply recent techniques from
empirical processes here (cf. Nickl (2007), Giné and Nickl (2008, 2009b) together with
moment inequalities in Giné and Koltchinskii (2006)). We prove that if one performs
simulations of order $k \gg n^2$, then the indirect inference estimators are asymptotically
equivalent to the classical minimum distance estimators. A main advantage of this proof
strategy is that no differentiability properties of the objective function $Q_{n,k}$ have to be
used, and that in turn a large class of simulation mechanisms is admissible. More importantly,
this proof strategy allows for the presumably critical condition $t > 1/2$ on the underlying
density $p_0$, where $t$ is the index governing the regularity of $p_0$.

2. The method of proof described above works if many simulations are performed ($k \gg n^2$).
However, this condition is not intrinsic to the problem, and the case where the number of
simulations $k$ is of a smaller order than $n^2$ is also of interest. In particular, in the case
where $k/n \to \kappa$, $0 < \kappa < \infty$, one has to expect that the asymptotic variance of
simulated indirect inference estimators is inflated by the factor $(1 + 1/\kappa)$. If one is
interested in these cases, the (comparatively) 'brute force' methods described in the previous
paragraph cannot be used. Alternatively, one can try to apply the usual M-estimation asymptotic
normality proof to the criterion function $Q_{n,k}(\theta)$. Among other things this requires
differentiation of the simulated estimators $P_k(\theta)$ with respect to $\theta$. Since $P_k(\theta)$
is constructed by applying an approximate identity to the empirical measure from the simulated
sample, the proofs become more delicate in this case. [Differentiating an approximate identity
$h^{-1}K(X(\theta)/h)$ w.r.t. $\theta$ introduces a 'penalty' of an additional $h^{-1}$ from the chain
rule.] We are able, nevertheless, to establish asymptotic normality of the simulated indirect
inference estimator with these simulation sizes as well, under slightly stronger conditions (on
the underlying density and the simulation mechanism), and with the expected inflation of
variances if $\lim_n k/n < \infty$. Again, the empirical process techniques mentioned in the
previous paragraphs, together with some facts from approximation theory, are central to our
proofs.

We should comment on some related literature. Related papers are Gallant and Long (1997)
and Fermanian and Salanié (2004). The first paper studies the case where $P_n$ is based on
nonparametric MLEs over sieves spanned by Hermite polynomials, but their limiting result is
only informative if the sieve dimension stays bounded (so that efficiency of the estimator is
only established if the true density is a finite linear combination of Hermite polynomials).
Fermanian and Salanié (2004) propose different (but somewhat related) procedures, and establish
asymptotic efficiency of their estimators under several high-level conditions, which, as they
admit themselves, are very stringent. Even in the simplest model they consider, they need to
have simulations of order $k \sim$ , and the nonparametric estimators considered seem to be only
sensible if the true density is very smooth. There are also some other related recent papers on
this topic, Altissimo and Mele (2009) and Carrasco, Chernov, Florens, Ghysels (2007), whose
proofs, however, we were not able to follow.

The outline of the paper is as follows: After some preliminaries in Section 2, we introduce the
model and assumptions, define the auxiliary spline projection estimators as well as the indirect
inference estimator in Section 3, and present the main result (Theorem 1) on asymptotic
efficiency of the indirect inference estimator. Some basic facts on dyadic splines are
summarized in Section 4. Section 5 is devoted to the proof of Theorem 1. Section 6 develops
auxiliary convergence rate results for the auxiliary spline projection estimators needed in the
proof of Theorem 1. Section 7 establishes a uniform central limit theorem for spline projection
estimators that is also essential in the proof of the main result. Three appendices contain
further technical results on Besov spaces, projections onto Schoenberg spaces, and moment
inequalities for empirical processes.

2 Preliminaries and Notation



We denote the Euclidean norm of a vector $x$ by $\|x\|$ and the associated operator norm
of a matrix $A$ by $\|A\|$. With $\mathcal{L}^p := \mathcal{L}^p([0,1], \lambda)$, $1 \le p < \infty$, we denote
the vector space of Borel-measurable $p$-fold integrable real-valued functions on $[0,1]$, where
$\lambda$ denotes Lebesgue measure on $[0,1]$, the (semi)norm on $\mathcal{L}^p$ being denoted by
$\|h\|_p$. Furthermore, $\|h\|_\infty$ stands for the supremum norm (not the essential supremum norm)
of a real-valued function $h$ defined on $[0,1]$. If $H$ is a vector- or matrix-valued function on
$[0,1]$ then $\|H\|_p$ is shorthand for $\|\, \|H\| \,\|_p$, and similarly for the supremum norm. By
$L^\infty$ we denote the space of all bounded Borel-measurable real-valued functions on $[0,1]$
endowed with the supremum norm. For a (measurable) real-valued function $g$ on $\mathbb{R}$ and
$1 \le p < \infty$ we write $\|g\|_{p,\mathbb{R}}$ to denote its $\mathcal{L}^p$-(semi)norm (w.r.t. Lebesgue
measure on $\mathbb{R}$); and we write $\|g\|_{\infty,\mathbb{R}}$ for the supremum norm (not the essential
supremum norm). For sequences $a_n$ and $b_n$ of positive real numbers we write $a_n \sim b_n$ to
denote the fact that the sequence $a_n/b_n$ is bounded away from zero and infinity.

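We next introduce Besov spaces. For a function $g : \mathbb{R} \to \mathbb{R}$ and $z \in \mathbb{R}$, the
difference operator $\Delta_z$ is defined by $\Delta_z g(\cdot) = g(\cdot + z) - g(\cdot)$ and inductively by
$\Delta_z^a g(\cdot) = \Delta_z(\Delta_z^{a-1} g(\cdot))$ for integer $a \ge 2$. For $h : [0,1] \to \mathbb{R}$, we
define $\Delta_z^a(h)(x)$ as above if $x, x + az \in [0,1]$, and set $\Delta_z^a(h)(x) = 0$ otherwise. For
$0 < s < \infty$ we define function spaces $\mathcal{B}_s$ on $[0,1]$ as follows.

Definition 1 For $s \in (0, \infty)$, $a \in (s, \infty) \cap \mathbb{N}$, and $h \in \mathcal{L}^2$ define

$$\|h\|_{s,2} := \|h\|_2 + \sup_{z \neq 0} |z|^{-s} \|\Delta_z^a(h)\|_2.$$

Define further

$$\mathcal{B}_s := \{h \in \mathcal{L}^2 : \|h\|_{s,2} < \infty\}.$$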

The space $\mathcal{B}_s$ does not depend on $a$ in the sense that different choices of $a > s$ result
in equivalent (semi)norms. For definiteness we shall always choose $a$ to be the smallest integer
larger than $s$ in the sequel. It is well-known (Proposition [7] in Appendix [5]) that for
$s > 1/2$ every function in $\mathcal{B}_s$ is $\lambda$-almost everywhere equal to a (uniquely
determined) continuous function in $\mathcal{B}_s$. It thus proves useful to define for $s > 1/2$ the
Banach space $(\mathsf{B}_s, \|\cdot\|_{s,2})$, where $\mathsf{B}_s = \mathcal{B}_s \cap C([0,1])$
and $C([0,1])$ denotes the set of continuous real-valued functions on $[0,1]$.
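
The iterated difference operator entering Definition 1 is easy to evaluate numerically. The following Python sketch is an illustration only: the function names and the midpoint rule used to approximate the $\mathcal{L}^2$-norm are our own choices, and the closed form $\sum_{i=0}^{a} (-1)^{a-i} \binom{a}{i} h(x + iz)$ is used in place of the recursive definition.

```python
import math

def difference(h, z, a):
    """Return x -> Delta_z^a(h)(x) for h on [0, 1]; 0 unless x and x + a*z lie in [0, 1]."""
    def delta(x):
        if not (0.0 <= x <= 1.0 and 0.0 <= x + a * z <= 1.0):
            return 0.0
        # Closed form of the a-fold iterated difference:
        # sum over i of (-1)**(a - i) * C(a, i) * h(x + i*z).
        return sum((-1) ** (a - i) * math.comb(a, i) * h(x + i * z) for i in range(a + 1))
    return delta

def l2_norm(f, m=2000):
    """Midpoint-rule approximation of the L2 norm of f on [0, 1]."""
    return math.sqrt(sum(f((i + 0.5) / m) ** 2 for i in range(m)) / m)

def square(x):
    return x * x

# Second difference of x**2 equals 2 * z**2 wherever it is defined.
d2 = difference(square, 0.1, 2)
# One term |z|**(-s) * ||Delta_z^a(h)||_2 of the supremum, here with s = 1.5 and a = 2.
term = 0.1 ** (-1.5) * l2_norm(d2)
```

Approximating the supremum over $z$ would require evaluating such terms over a grid of step sizes; this is not attempted here.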

A little reflection shows that $\mathcal{B}_s$ is just the usual Besov (or generalized Lipschitz)
space $B^s_{2\infty}$ as, e.g., defined in Chapter 2, Section 10 of DeVore and Lorentz (1993) (with
the only difference that there the space is viewed as a space of equivalence classes of
functions). The space $\mathcal{B}_s$ contains the classical Sobolev space of order $s$ as a subset.
Recall that for integer $s$ the Sobolev space of order $s > 0$ is given by

$$\mathcal{W}_2^s = \{h \in \mathcal{L}^2 : D_w^i h \in \mathcal{L}^2 \text{ for } 0 \le i \le s,\ i \text{ integer}\},$$

where $D_w$ denotes the weak differential operator. Then for integer $s > 0$

$$\|h\|_{s,2} \le C(s) \sum_{0 \le i \le s} \|D_w^i h\|_2 \tag{5}$$

holds for some universal constant C(s) and all h in the Sobolev space of order s; cf. p. 46 and p. 52f
in DeVore and Lorentz (1993). Some further properties of Besov spaces and their relationship to
splines that we shall need in the sequel are summarized in Appendix |21

3 Main Results



Let $X_1, \ldots, X_n$ be independent and identically distributed (i.i.d.) on a compact interval
in $\mathbb{R}$ with law $P$ and Lebesgue density $p_0$. Without loss of generality we shall take this
interval to be $[0,1]$. We assume that a parametric model $\mathcal{P}_\Theta$ is given, i.e.,
$\mathcal{P}_\Theta = \{p(\theta) : \theta \in \Theta\}$, where the functions $p(\theta) : [0,1] \to \mathbb{R}$ are
probability densities and the parameter space $\Theta$ is a subset of $\mathbb{R}^b$. The probability
measure on $[0,1]$ corresponding to $p(\theta)$ will be denoted by $P(\theta)$. We consider here the
case where direct likelihood methods for estimation of $\theta$ cannot be used for the reasons
outlined in the introduction. Suppose, however, that it is feasible to obtain for each
$\theta \in \Theta$ simulated data $X_i(\theta)$ via

$$X_i(\theta) = \rho(V_i, \theta), \qquad i = 1, \ldots, k, \tag{6}$$

that are distributed i.i.d. with density $p(\theta)$ and that are independent of the original
sample. [The simulation mechanism may result from an equation for the data as described in
Section 1 but may also be obtained in some other way.] More precisely, we assume that the random
variables $V_i$ driving the simulation mechanism are i.i.d. with values in some measurable space
$(V, \mathfrak{V})$, the distribution on $V$ induced by $V_i$ being denoted by $\mu$; furthermore, we
assume that for every $\theta \in \Theta$, the $\mathfrak{V}$-measurable function
$\rho(\cdot, \theta) : V \to [0,1]$ is such that the law of $\rho(V_i, \theta)$ has density $p(\theta)$ and
that the collection of random variables $\{V_i\}$ is independent of the collection $\{X_i\}$. As the
main result depends only on the distribution of the random variables $X_i$ and $V_i$, we can
assume without loss of generality that the original data $X_i$ as well as the variables $V_i$ are
defined as the respective coordinate projections on the product probability space
$([0,1]^\infty \times V^\infty, \mathfrak{B}_{[0,1]}^\infty \otimes \mathfrak{V}^\infty, P^\infty \otimes \mu^\infty)$;
we shall denote by Pr the product probability measure $P^\infty \otimes \mu^\infty$. The basic framework
outlined above will be maintained throughout the rest of the paper.

Remark 1 To avoid possible misunderstanding we note the following: (i) Equation (6) implies
that one needs to obtain one and only one simulated sample $V_1, \ldots, V_k$ in order to compute
$X_i(\theta)$ for any $\theta \in \Theta$. There is no need to separately draw random samples for every
$\theta$. (ii) Simulation mechanisms like (6) naturally occur in the domain of application of
indirect inference which consists of statistical models where the data $X_i$ are assumed to arise
as the output of an equation that is parameterized by $\theta$ and is driven by some stochastic
noise variables. These stochastic noise variables then often play the role of $V_i$.
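
Point (i) of the remark can be illustrated by a short Python sketch. The mechanism `rho(v, theta) = v ** theta` with uniformly distributed $V_i$ is a hypothetical example chosen only because it maps into $[0,1]$; it is not taken from the paper and is not claimed to satisfy the assumptions imposed below.

```python
import random

def rho(v, theta):
    """Hypothetical simulation mechanism with values in [0, 1] for v in [0, 1), theta > 0."""
    return v ** theta

def simulate(V, theta):
    """Simulated sample X_i(theta) = rho(V_i, theta) built from a fixed noise sample V."""
    return [rho(v, theta) for v in V]

rng = random.Random(0)
V = [rng.random() for _ in range(1000)]  # drawn once, before any theta is considered

x_half = simulate(V, 0.5)  # the same V is reused ...
x_two = simulate(V, 2.0)   # ... for every parameter value
```

Since `V` is held fixed, the simulated sample, and hence any objective function built from it, varies with $\theta$ only through `rho`.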

We next construct auxiliary estimators for $p_0$ from the original data as well as from the
simulated data. The estimator of $p_0$ based on the original data is a spline projection estimator
based on B-splines of order $r_* \ge 1$ and is given by

$$p_{n,j,r_*}(y) = \sum_{l=-r_*+1}^{2^j-1} \hat\gamma_{jl}^{(r_*)} N_{jl}^{(r_*)}(y)$$

with

$$\hat\gamma_{jl}^{(r_*)} = \sum_{m=-r_*+1}^{2^j-1} 2^j g_{lm}^{(j,r_*)} \int N_{jm}^{(r_*)}(x)\, dP_n(x).$$

Here $N_{jl}^{(r_*)}$ denote the B-spline basis functions forming a basis for the Schoenberg space
$\mathcal{S}_j(r_*)$ and the coefficients $g_{lm}^{(j,r_*)}$ are the elements of $2^{-j}$ times the inverse
of the Gram matrix of the B-spline basis $N_{jl}^{(r_*)}$; see Section 4 for definitions.
Furthermore, $P_n = n^{-1} \sum_{i=1}^n \delta_{X_i}$ denotes the empirical measure of the original
data. The positive integer $j$ represents a tuning parameter that governs the dimension of the
approximating space ('sieve') spanned by the B-spline basis. Similarly, from each simulated data
set $X_i(\theta)$, we construct estimators for $p(\theta)$ based on order-$r$ B-splines via

$$p_{k,J,r}(\theta)(y) = \sum_{l=-r+1}^{2^J-1} \hat\gamma_{Jl}^{(r)}(\theta) N_{Jl}^{(r)}(y)$$

with

$$\hat\gamma_{Jl}^{(r)}(\theta) = \sum_{m=-r+1}^{2^J-1} 2^J g_{lm}^{(J,r)} \int N_{Jm}^{(r)}(x)\, dP_k(\theta)(x) \tag{8}$$

and $P_k(\theta) = k^{-1} \sum_{i=1}^k \delta_{X_i(\theta)}$. Note that $r_*$ and $r$ need not take the same
value, nor need $j$ and $J$. [For example, $r = 4$ would correspond to using cubic splines for the
construction of $p_{k,J,r}(\theta)$, while $r_* = 1$ would correspond to using the Haar basis for the
construction of $p_{n,j,r_*}$.] In the sequel we shall often write $p_{k,J,r}(\theta, y)$ for
$p_{k,J,r}(\theta)(y)$ and similarly $p(\theta, x)$ for $p(\theta)(x)$.

The idea behind indirect inference is that, given the parametric model is correctly specified
in the sense that $p_0 = p(\theta_0)$ $\lambda$-almost everywhere for some $\theta_0 \in \Theta$, the
particular value of $\theta$ corresponding to the simulation-based estimator $p_{k,J,r}(\theta)$ closest
to $p_{n,j,r_*}$ (in an appropriate metric) should provide a reasonable estimator
$\hat\theta_{n,k}$ of $\theta_0$, since $p_{n,j,r_*}$ will estimate $p_0 = p(\theta_0)$ ($\lambda$-a.e.)
consistently (under appropriate assumptions and choices of $j$, $J$, and $k$). That is, as
explained in Section 1, the estimator $\hat\theta_{n,k}$ can be viewed as a simulation-based version
of a minimum distance estimator.

To implement this idea we introduce the indirect inference objective function measuring
closeness of $p_{n,j,r_*}$ and $p_{k,J,r}(\theta)$:

$$Q_{n,k}(\theta) := Q_{n,k,j,J,r_*,r}(\theta) = \begin{cases} \int_0^1 (p_{n,j,r_*} - p_{k,J,r}(\theta))^2\, p_{n,j,r_*}^{-1}\, d\lambda & \text{on the event } A_n, \\ 0 & \text{otherwise,} \end{cases} \tag{9}$$

where $A_n = \{p_{n,j,r_*}(y) > 0 \text{ for every } y \in [0,1]\}$, which is measurable as is easily
seen. Note that $Q_{n,k}(\theta) : [0,1]^\infty \times V^\infty \to \mathbb{R}$ is
$\mathfrak{B}_{[0,1]}^\infty \otimes \mathfrak{V}^\infty$-measurable for every $\theta \in \Theta$ as a consequence
of Tonelli's Theorem since $p_{n,j,r_*}$ and $p_{k,J,r}(\theta)$ are both jointly measurable (w.r.t.
the combined data and the argument $y$) and since $A_n$ is measurable. Furthermore, since all
functions involved are piecewise polynomials with dyadic breakpoints, the integral featuring in
the definition of $Q_{n,k}(\theta)$ can be computed in a numerically efficient way.
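
As a rough illustration of how the objective function $Q_{n,k}(\theta)$ defined above can be evaluated exactly, the Python sketch below treats the simplest case of piecewise constant estimators on dyadic bins (order-1 splines, i.e., the Haar basis), with the same resolution level for both estimators. These simplifications and all function names are assumptions made for illustration; the results of this paper concern higher-order B-splines for the simulated data and allow different resolution levels.

```python
def dyadic_histogram(sample, j):
    """Piecewise constant density estimate: heights on the 2**j dyadic bins of [0, 1]."""
    bins = 2 ** j
    counts = [0] * bins
    for x in sample:
        # A point x = 1.0 is assigned to the last bin.
        counts[min(int(x * bins), bins - 1)] += 1
    return [c * bins / len(sample) for c in counts]

def objective(p_data, p_sim):
    """Integral of (p_data - p_sim)**2 / p_data over [0, 1]; zero off the event {p_data > 0}."""
    if min(p_data) <= 0.0:
        return 0.0
    width = 1.0 / len(p_data)
    # Both functions are constant on each bin, so the integral is an exact finite sum.
    return sum((a - b) ** 2 / a for a, b in zip(p_data, p_sim)) * width

p_n = dyadic_histogram([0.1, 0.3, 0.6, 0.9], j=1)  # heights [1.0, 1.0]
q = objective(p_n, [0.5, 1.5])                     # (0.25 + 0.25) * 0.5
```

For higher-order splines the integrand is a ratio of piecewise polynomials on each dyadic cell, so the integral is no longer a plain sum of bin values but can still be computed cell by cell.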

Remark 2 (i) We have chosen to assign $Q_{n,k}(\theta)$ the value zero on the complement of $A_n$
for convenience. Since the event $A_n$ will be seen to have probability approaching 1 under our
assumptions, this particular assignment is irrelevant for asymptotic considerations. However, from
a more practical point of view, one might want to use the objective function ^oiPn,j,r-_, — 

Pk.j,ri9))^Pn \ r,'^'^ instead, which clearly coincides with Qn,k on An. 

(ii) In principle, auxiliary estimators other than spline projection estimators could be used in
the definition of $Q_{n,k}(\theta)$. We do not pursue this in this paper but see Gach (2010). We note
that standard kernel density estimators are inappropriate here because of boundary effects.

An indirect inference estimator $\hat\theta_{n,k} := \hat\theta_{n,k,j,J,r_*,r}$ is now defined to be
any measurable function that satisfies

$$\inf_{\theta \in \Theta} Q_{n,k}(\theta) = Q_{n,k}(\hat\theta_{n,k}). \tag{10}$$

For the sake of simplicity, we shall use the abbreviation $Q_{n,k}$ to denote $Q_{n,k,j,J,r_*,r}$ as
well as $Q_{n,k,j_n,J_k,r_*,r}$, the precise meaning always being clear from the context. [A similar
comment applies to $\hat\theta_{n,k}$, as well as to $Q_n$ and $\hat\theta_n$ defined later in Section
5.2.] That such an estimator exists is shown in the next proposition, the proof of which can be
found in Appendix B.

Proposition 1 Suppose $\Theta$ is compact in $\mathbb{R}^b$ and that the simulation mechanism
$\rho(v, \cdot)$ is continuous on $\Theta$ for every $v \in V$. Furthermore, assume that $r_* \ge 1$ and
$r \ge 2$ hold. Then there exists a $\mathfrak{B}_{[0,1]}^\infty \otimes \mathfrak{V}^\infty$-measurable mapping
$\hat\theta_{n,k}$ satisfying (10).

Remark 3 (Computational issues) (i) As noted in Remark 1, only one sample of $V_1, \ldots, V_k$
needs to be drawn before $\hat\gamma_{Jl}^{(r)}(\theta)$ can be evaluated for any arbitrary
$\theta \in \Theta$ via (8). The computational costs for evaluating $\hat\gamma_{Jl}^{(r)}(\theta)$ are
trivial.

(ii) The evaluation of the objective function $Q_{n,k}(\theta)$ at an arbitrary $\theta \in \Theta$ is
not computationally expensive either: Since $p_{k,J,r}(\theta)$ is a linear combination of B-spline
basis functions with coefficients $\hat\gamma_{Jl}^{(r)}(\theta)$, the objective function
$Q_{n,k}(\theta)$ can be written as a linear-quadratic form in the variables
$\hat\gamma_{Jl}^{(r)}(\theta)$, where the entries of the weight matrix and the coefficients of the
linear part are integrals of functions that do not depend on $\theta$ (and are simple functions of
linear combinations of B-spline basis functions). Consequently, the integrations have to be done
only once and the evaluation of $Q_{n,k}(\theta)$ then reduces to computation of the
linear-quadratic form in the variables $\hat\gamma_{Jl}^{(r)}(\theta)$.

(iii) Minimization of $Q_{n,k}(\theta)$ over $\Theta$ is now a standard optimization problem and has a
level of computational complexity comparable to computation of common (non-simulation-based)
optimization estimators. Standard techniques like grid search, Newton-Raphson-type procedures,
or stochastic search procedures as in Beran and Millar (1987) can be applied. Similarly as in
the case of non-simulation-based optimization estimators, it is in fact feasible to show that the
estimators generated by such a numerical procedure have the same asymptotic properties as the
estimator $\hat\theta_{n,k}$ under appropriate assumptions.
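
Part (ii) of the remark can be illustrated in the simplest piecewise constant (order-1, Haar) case with a common resolution level, where the coefficient vector is just the vector of bin heights and the weight matrix is diagonal. This special case and all names below are assumptions made for illustration; for B-splines of higher order the weight matrix would be banded rather than diagonal.

```python
def precompute(p_data):
    """Theta-free ingredients of the linear-quadratic form, computed once from the data."""
    width = 1.0 / len(p_data)
    const = sum(p_data) * width              # integral of p_data
    linear = [-2.0 * width] * len(p_data)    # -2 * integral of each bin indicator
    quad = [width / a for a in p_data]       # diagonal weights: integral of indicator / p_data
    return const, linear, quad

def q_from_coefficients(gamma, const, linear, quad):
    """Evaluate the objective from the coefficients gamma(theta) of the simulated estimator."""
    lin_part = sum(c * g for c, g in zip(linear, gamma))
    quad_part = sum(w * g * g for w, g in zip(quad, gamma))
    return const + lin_part + quad_part

const, linear, quad = precompute([1.0, 1.0])          # requires all heights to be positive
q = q_from_coefficients([0.5, 1.5], const, linear, quad)
```

Only `q_from_coefficients` has to be called inside the optimization loop; `precompute` is executed a single time.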

We now introduce the following assumptions on the parametric model that will be used to 
prove the main result. 

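Assumption P1: (i) The parameter space $\Theta$ is a compact subset of $\mathbb{R}^b$. There exists
a $\theta_0 \in \Theta$ such that $p_0 = p(\theta_0)$ $\lambda$-almost everywhere. Furthermore,
$p(\theta) = p(\theta_0)$ $\lambda$-almost everywhere implies $\theta = \theta_0$. The mapping
$\theta \mapsto p(\theta, x)$ is continuous on $\Theta$ for every $x \in [0,1]$. The density $p(\theta_0)$ is
positive on $[0,1]$.

(ii) $\mathcal{P}_\Theta$ is a bounded subset of $\mathsf{B}_t$ for some $t > 1/2$.

(iii) $\theta_0$ is an interior point of $\Theta$. There is an open ball $B(\theta_0) \subseteq \Theta$ with
center $\theta_0$ such that the map $\theta \mapsto p(\theta, x)$ is twice continuously differentiable on
$B(\theta_0)$ for every $x \in [0,1]$. Furthermore,

$$\int_0^1 \sup_{\theta \in B(\theta_0)} \|\nabla_\theta p(\theta, x)\|^2\, dx < \infty, \qquad \int_0^1 \sup_{\theta \in B(\theta_0)} \|\nabla_\theta^2 p(\theta, x)\|\, dx < \infty,$$

and $\int_0^1 \nabla_\theta p(\theta_0, x) \nabla_\theta p(\theta_0, x)'\, p(\theta_0, x)^{-1}\, dx$ is positive
definite. [Here $\nabla_\theta$ denotes the gradient w.r.t. $\theta$ written as a column vector and
$\nabla_\theta^2$ denotes the matrix of second derivatives.]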

(iv) For some $\zeta > 1/2$

$$\frac{\partial p(\theta_0, \cdot)}{\partial \theta_q} \in \mathsf{B}_\zeta$$

holds for every $q = 1, \ldots, b$.



Assumption P1(i) is a standard assumption that implies consistency of the maximum likelihood
estimator. In particular, it expresses the fact that the parametric model is correctly
specified and that the true parameter value is identifiable. Assumption P1(iii) in conjunction
with P1(i) is a typical assumption used to establish asymptotic normality of the maximum
likelihood estimator and the information matrix equality. Assumption P1(ii) requires the
parametric density functions to behave "regularly" as functions of $x$ (uniformly in $\theta$), the
condition being quite weak: Note that if $t$ is close to 1/2 the density functions are not even
required to be differentiable; all that is required is essentially that the functions are
"$\mathcal{L}^2$-Hölder continuous" of order $t$, uniformly over $\Theta$. [Given compactness of
$\Theta$, a sufficient condition for Assumption P1(ii) is that $\mathcal{P}_\Theta \subseteq \mathsf{B}_t$
for some $t > 1/2$ and that the map $\theta \mapsto p(\theta)$ from $\Theta$ to $\mathsf{B}_t$ is continuous;
in fact, continuity of the map $\theta \mapsto \|p(\theta)\|_{t,2}$ already suffices. A simple sufficient
condition for this (with $t = 1$) is continuity of $\theta \mapsto \|p(\theta)\|_2$ and
$\theta \mapsto \|D_w p(\theta)\|_2$ on $\Theta$, cf. (5).] In a similar vein, Assumption P1(iv) imposes
an analogous weak regularity condition on the derivative of $p(\theta)$ (w.r.t. $\theta$) at
$\theta = \theta_0$.

For parts of the main result we will need to supplement Assumption P1 by the following
assumption.

Assumption P2: (i) The set $\{\partial p(\theta, \cdot)/\partial\theta_q : q = 1, \ldots, b,\ \theta \in B(\theta_0)\}$
is a relatively compact subset of $\mathcal{L}^2$, where $B(\theta_0)$ is defined in Assumption P1.

(ii) The set $\{\partial^2 p(\theta, \cdot)/\partial\theta_q \partial\theta_{q'} : q, q' = 1, \ldots, b,\ \theta \in B(\theta_0)\}$
is a bounded subset of $\mathcal{L}^2$, i.e.,

$$\sup_{\theta \in B(\theta_0)} \int_0^1 \|\nabla_\theta^2 p(\theta, x)\|^2\, dx < \infty.$$

These assumptions are not restrictive. For example, Assumption P2(i) is satisfied if the
indicated set of functions is a bounded subset of a Besov space $\mathcal{B}_s$ with $s$ only
satisfying $s > 0$, which is a very weak condition.

We also need assumptions on the simulation mechanism $\rho$. The basic assumption will be that
the function $\rho$ satisfies a Hölder continuity condition in $\theta$ (Assumption R(i)). For some
of the results we shall need an additional assumption including twice differentiability in a
neighborhood of $\theta_0$ (Assumption R(ii)).

Assumption R: (i) The function $\rho$ is uniformly Hölder in $\theta$; more precisely, for some
$0 < L < \infty$ and some $0 < \alpha \le 1$

$$\sup_{v \in V} |\rho(v, \theta) - \rho(v, \theta')| \le L \|\theta - \theta'\|^\alpha$$

holds for all $\theta, \theta' \in \Theta$.

(ii) There is an open ball $B(\theta_0) \subseteq \Theta$ with center $\theta_0$ such that the map
$\theta \mapsto \rho(v, \theta)$ is twice continuously differentiable on $B(\theta_0)$ for every $v \in V$ and

$$\sup_{v \in V,\, \theta \in B(\theta_0)} \|\nabla_\theta \rho(v, \theta)\| < \infty, \qquad \sup_{v \in V,\, \theta \in B(\theta_0)} \|\nabla_\theta^2 \rho(v, \theta)\| < \infty.$$

Furthermore, for some $0 < L' < \infty$ and some $0 < \beta \le 1$

$$\sup_{v \in V} \|\nabla_\theta^2 \rho(v, \theta) - \nabla_\theta^2 \rho(v, \theta')\| \le L' \|\theta - \theta'\|^\beta$$

holds for all $\theta, \theta' \in B(\theta_0)$.

Assumptions on the parametric model $\mathcal{P}_\Theta$ and assumptions on the simulation mechanism
$\rho$ are of course interrelated. For example, one could in principle only impose appropriate
assumptions on $\rho$ and then deduce the existence of a $\mathcal{P}_\Theta$ with the required
properties from those assumptions; see Gach (2010) for some discussion. However, as this does
not seem to lead to a transparent catalogue of assumptions, we have chosen to formulate the
assumptions in the form given above.

We now first establish consistency of the indirect inference estimator. The assumptions used
for the consistency result in the subsequent proposition are stronger than what is actually
needed for such a result, but we do not strive for utmost generality in the consistency result
as this is not the main focus of the paper. The proof is given in Section 5.1.

Proposition 2 Suppose Assumptions P1(i),(ii) and R(i) are satisfied and that $r_* \ge 2$ and $r \ge 2$
hold. If $j_n \to \infty$ as $n \to \infty$ and $J_k \to \infty$ as $k \to \infty$ in such a way that for some $d > 1/2$ we

have sup„>i 2^"'^^^+''-yn < oo and supfc>i Jk2-^>'^'^^+'^^ /k < oo, then 



We note that the condition on $j_n$ is, e.g., satisfied if $2^{j_n} \sim n^{\psi}$ with
$0 < \psi < 1/2$. A similar comment applies to $J_k$. In particular, the 'textbook' choice
$\psi = 1/(2t + 1)$ with $t$ from Assumption P1(ii) is covered.
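
One possible way of turning a rate condition of the form $2^{j_n} \sim n^{\psi}$, with the 'textbook' exponent $\psi = 1/(2t+1)$, into an integer resolution level is sketched below in Python. Rounding $\psi \log_2 n$ to the nearest integer is a hypothetical choice (as is the function name); any rule keeping $2^{j_n}/n^{\psi}$ bounded away from zero and infinity would serve equally well.

```python
import math

def resolution_level(n, t):
    """Integer j with 2**j of the order n**(1/(2t+1)); t is the assumed smoothness index."""
    psi = 1.0 / (2.0 * t + 1.0)
    # Nearest-integer rounding keeps 2**j / n**psi between 2**(-1/2) and 2**(1/2).
    return max(0, round(psi * math.log2(n)))

levels = {n: resolution_level(n, t=1.0) for n in (100, 1024, 10000)}
```

The number of basis functions then grows like $2^{j_n}$, i.e., far more slowly than the sample size.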

For the main result we need to distinguish several cases characterized by the behavior of the
number $k(n) \in \mathbb{N}$ of simulated data as a function of sample size $n$:

Assumption S1: $\lim_{n \to \infty} k(n)/n^2 = \infty$.

Assumption S2: $\lim_{n \to \infty} k(n)/n = \infty$.

Assumption S3: $\lim_{n \to \infty} k(n)/n = \kappa$ for some $0 < \kappa < \infty$.

The theorem given below is the main result and shows that, under appropriate conditions on
the resolution levels $j_n$ and $J_k$, the indirect inference estimator $\hat\theta_{n,k}$ is
asymptotically normal and has the same limiting distribution as the maximum likelihood estimator
provided the number $k(n)$ of simulated data grows sufficiently fast as a function of sample size
$n$. This is established under the quite weak Assumption R(i) if $k(n)$ grows faster than $n^2$.
If $k(n)$ is only required to grow faster than $n$, the same result is obtained under somewhat
stronger assumptions (Assumption R, $t > 3/2$, $r \ge 4$). Under the latter assumptions, the
theorem also shows that in case $k(n)$ behaves asymptotically like $n$, the indirect inference
estimator is still asymptotically normal but its asymptotic variance-covariance matrix is then
inflated by a factor $1 + 1/\kappa$, where $\kappa = \lim_{n \to \infty} k(n)/n$. We also note that the
condition $t \le r_* \wedge r$ in the subsequent theorem is virtually no restriction as discussed in
Remark 4 below. The proof of the subsequent theorem is deferred to Section 5.
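
The size of the variance inflation is easy to tabulate. The Python helper below (a hypothetical name, introduced only for illustration) simply evaluates the factor $1 + 1/\kappa$ described in the preceding paragraph: with as many simulated as observed data points ($\kappa = 1$) the asymptotic variance doubles, whereas ten times as many simulated data points ($\kappa = 10$) cost about ten percent.

```python
def variance_inflation(kappa):
    """Factor 1 + 1/kappa by which the asymptotic variance is inflated when k(n)/n -> kappa."""
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    return 1.0 + 1.0 / kappa

factors = {kappa: variance_inflation(kappa) for kappa in (0.5, 1.0, 10.0)}
```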

Theorem 1 Suppose $r \ge 2$ and $r_* \ge 2$ hold and Assumption P1 is satisfied for some
$1/2 < t \le r_* \wedge r$. Suppose that $2^{j_n} \sim n^{1/(2t+1)}$ and $2^{J_{k(n)}} \sim k(n)^{1/(2t+1)}$.

a. Suppose one of the following two conditions holds:

1. Assumptions R(i) and S1 hold.

2. Assumptions P2, R, and S2 hold, and $t > 3/2$, $r \ge 4$ are satisfied.
Then 



Qn.k ^ Oo 'in Pr -probability as n A k ^ oo. 




as n ^ oo 



where I{8o) — (Jq ^ep{9o,x)Vop{9o,xyp{9Q,x) ^dxj is the Cramer-Rao bound. 
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b. Suppose Assumptions P2, R, and S3 hold for some < k < oo, and that r > 3/2, r > 4 
are satisfied. Then 



We note that the rates of increase for $2^{j_n}$ and $2^{J_{k(n)}}$ specified in the above theorem are precisely the rate-optimal choices based on mean integrated squared error. As already alluded to prior to the theorem, in Part a of the theorem there is a trade-off between the stringency of the assumptions on the model and the simulation mechanism on the one hand and the assumptions on the rate of increase of $k(n)$ (Assumption S1 versus S2) on the other hand. While the particular form of the trade-off is a consequence of the two different methods of proof employed for Part a1 and Part a2 (and thus may in principle be an artefact), it seems plausible that some sort of trade-off is intrinsic to the problem.
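
As a small illustration of the rate conditions in Theorem 1, the following Python sketch computes integer resolution levels with $2^{j} \sim n^{1/(2\tau+1)}$. The function name `resolution_level` and the rounding rule are assumptions made here for illustration; the theorem only requires the stated order of magnitude, so any choice within a constant factor would serve equally well.

```python
import math


def resolution_level(sample_size: int, tau: float) -> int:
    """Integer j such that 2**j is of the order sample_size**(1/(2*tau+1)).

    Illustrative helper (not from the paper): rounds on the log2 scale.
    """
    if sample_size < 1:
        raise ValueError("sample_size must be a positive integer")
    if tau <= 0.5:
        raise ValueError("the theorem requires tau > 1/2")
    return max(0, round(math.log2(sample_size) / (2.0 * tau + 1.0)))


# Example: data-based level j_n and simulation-based level J_{k(n)}
n = 1024
k = n ** 2          # a simulation size of order n^2 (boundary of Assumption S1)
j_n = resolution_level(n, tau=2.0)
J_k = resolution_level(k, tau=2.0)
```

The same helper is used for both levels because the theorem prescribes the same exponent $1/(2\tau+1)$ for $j_n$ (in terms of $n$) and for $J_{k(n)}$ (in terms of $k(n)$).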

Remark 4 (i) The condition $\tau < r_* \wedge r$ in the above theorem is not really a restriction on $\mathcal{P}_\Theta$ and can always be achieved in the following sense: If Assumption P1 holds with $\tau \geq r_* \wedge r$, it holds with $\tau$ replaced by any $\tau'$ satisfying $1/2 < \tau' < r_* \wedge r$ as well, since $\mathcal{B}_2^{\tau}$ is continuously embedded in $\mathcal{B}_2^{\tau'}$ for $\tau' < \tau$. Consequently, the above theorem can be applied with $\tau'$ replacing $\tau$ (requiring also $\tau' > 3/2$ for Parts a2 and b). [The restriction $\tau < r_* \wedge r$ in the theorem simply expresses the fact that the rate of increase of $j_n$ and $J_{k(n)}$ is not only governed by the degree of "regularity" $\tau$ of the densities in $\mathcal{P}_\Theta$, but also by the degrees of "regularity" of the splines used to estimate $p_0$ and $p(\theta)$, respectively, i.e., by $r_*$ and $r$.]

(ii) The argument underlying (i) also shows that $2^{j_n} \sim n^{1/(2\tau'+1)}$ and $2^{J_{k(n)}} \sim k(n)^{1/(2\tau'+1)}$ are feasible in Theorem 1 as it stands as long as $1/2 < \tau' \leq \tau$ (and $\tau' > 3/2$ for Parts a2 and b) are satisfied. A careful examination of the proof shows that the range for $2^{j_n}$ and $2^{J_{k(n)}}$, under which the conclusion of the theorem holds, is actually somewhat wider. However, we abstain from providing such results as they quickly get unwieldy.

(iii) If in Part a2 of Theorem 1 the Assumption S2 is strengthened by assuming a particular growth-rate for $k(n)$ such as, e.g., $k(n) = n^{\delta}$, $1 < \delta < 2$, this can be used to relax the assumption $\tau > 3/2$. We refrain from presenting such results.

(iv) If $k(n)$ is such that $0 < \liminf k(n)/n < \infty$, but $\limsup k(n)/n = \infty$, then the distribution of $\sqrt{n}\,(\hat\theta_{n,k(n)} - \theta_0)$ does not possess a limit, but 'oscillates' between accumulation points of the form $N(0, I(\theta_0))$ and $N(0, (1 + 1/\kappa) I(\theta_0))$ where now $\kappa = \liminf_{n\to\infty} k(n)/n$.

(v) A result similar to Part a1 of Theorem 1 can be proved in case $r_* = 1$. Since this requires a separate proof, we do not give such a result for the sake of brevity.

Under Assumption P1 the expression $\Psi(\theta) = \int_0^1 \nabla_\theta p(\theta)\nabla_\theta p(\theta)'\, p(\theta)^{-1}\, d\lambda$ depends continuously on $\theta$ by dominated convergence. Hence, $\Psi(\tilde\theta)^{-1}$ is a consistent estimator for $I(\theta_0)$ for every consistent estimator $\tilde\theta$. However, this observation is not very helpful in the context of indirect inference as then expressions for the density $p(\theta)$ are typically not available. An alternative consistent estimator that is feasible to compute is described in the next proposition, which is proved in Section 5.5. In the following proposition let $\tilde\theta_{n,k}$ stand for an arbitrary consistent estimator that depends on the original data and perhaps also on the simulated data. Of course, under the assumptions of Proposition 2 we may take $\tilde\theta_{n,k} = \hat\theta_{n,k}$.

Proposition 3 Suppose Assumptions P1(i)-(iii), P2(i), and R(ii) hold. Suppose further that $\tilde\theta_{n,k} \to \theta_0$ in probability as $n \wedge k \to \infty$. Assume $r'_* \geq 2$ and $r' \geq 3$. If $j'_n \to \infty$ as $n \to \infty$ and $J'_k \to \infty$ as $k \to \infty$ in such a way that for some $\delta > 1/2$ we have $\sup_{n \geq 1} 2^{j'_n(2\delta+1)}/n < \infty$ and also $J'_k 2^{3J'_k}/k \to 0$, then

$$\left(\int_0^1 \nabla_\theta \tilde p_{k,J'_k,r'}(\tilde\theta_{n,k})\, \nabla_\theta \tilde p_{k,J'_k,r'}(\tilde\theta_{n,k})'\; \hat p^{-1}_{n,j'_n,r'_*}\, d\lambda\right)^{-1}$$

is well-defined on an event that has probability converging to 1, and is a consistent estimator for $I(\theta_0)$ as $n \wedge k \to \infty$.

Observe that the condition on $j'_n$ is satisfied if $2^{j'_n} \sim n^{\psi}$ with $0 < \psi < 1/2$; similarly, the condition on $J'_k$ is satisfied if $2^{J'_k} \sim k^{\psi'}$ with $0 < \psi' < 1/3$. The reason for allowing $r'$ to differ from $r$ in Theorem 1 is to be able to construct a consistent estimator for $I(\theta_0)$ also in cases where $r = 2$. Allowing $r'_*$ to be different from $r_*$ has the advantage of avoiding a constraint on $\tau$.



4 Dyadic Splines 

Let $T_j = \{t_l := l2^{-j} : l = 1, \ldots, 2^j - 1\}$ be a dyadic set of knots in $[0,1]$, where $j \in \mathbb{N}$, the set of nonnegative integers. A function $S : [0,1] \to \mathbb{R}$ is a (dyadic) spline of order $r \geq 2$ if on each of the intervals $[0, t_1)$, $(t_l, t_{l+1})$ for $l = 1, \ldots, 2^j - 2$, and $(t_{2^j-1}, 1]$ it is a polynomial of degree not larger than $r - 1$, and on at least one of the intervals it is a polynomial of degree exactly $r - 1$. The Schoenberg spaces $S_j(r)$ considered here consist of all splines of order less than or equal to $r$ that are $r - 2$ times continuously differentiable on $[0,1]$ (using one-sided derivatives on the boundary of $[0,1]$). For $r = 1$ we define the Schoenberg space $S_j(1)$ to be the space of all functions $S : [0,1] \to \mathbb{R}$ that are constant on the intervals $[0, t_1)$, $[t_l, t_{l+1})$ for $l = 1, \ldots, 2^j - 2$, and $[t_{2^j-1}, 1]$. The Schoenberg spaces are linear spaces of dimension $2^j + r - 1$. For $r \geq 2$ the B-spline basis for $S_j(r)$ is given by $\{N_{lj}^{(r)} : l = -r+1, \ldots, 0, 1, \ldots, 2^j - 1\}$ with

$$N_{lj}^{(r)}(x) = N^{(r)}(2^j x - l) \quad \text{for } x \in [0,1],$$

where $N^{(r)}$ is the B-spline function (of order $r$) given by the $r$-fold convolution

$$N^{(r)}(u) = 1_{[0,1)} * \cdots * 1_{[0,1)}(u) \quad \text{for } u \in \mathbb{R};$$

cf., e.g., Chapter 5 in DeVore and Lorentz (1993). In case $r = 1$ we set

$$N_{lj}^{(1)}(x) = N^{(1)}(2^j x - l) \quad \text{for } x \in [0,1],$$

for $l = 0, 1, \ldots, 2^j - 2$, where $N^{(1)}(u) = 1_{[0,1)}(u)$, but we set

$$N_{lj}^{(1)}(x) = 1_{[0,1]}(2^j x - l) \quad \text{for } x \in [0,1]$$

if $l = 2^j - 1$. The B-spline basis functions $N_{lj}^{(r)}$ are nonnegative, bounded by 1 in absolute value, and form a partition of unity, i.e.,

$$\sum_{l=-r+1}^{2^j-1} N_{lj}^{(r)}(x) = 1 \quad \text{for } x \in [0,1], \qquad (11)$$

for every $j, r \in \mathbb{N}$.

The Schoenberg space $S_j(r)$ is a finite-dimensional linear subspace of $\mathcal{L}^2$. The ortho-projection $\pi_j^{(r)}$ from $\mathcal{L}^2$ onto $S_j(r)$ is given by

$$\pi_j^{(r)}(f) = \sum_{l=-r+1}^{2^j-1} c_{lj}^{(r)}(f)\, N_{lj}^{(r)},$$

where

$$c_{lj}^{(r)}(f) = 2^j \sum_{m=-r+1}^{2^j-1} g_{j,lm}^{(r)} \int_0^1 N_{mj}^{(r)}(y) f(y)\, dy$$

and $g_{j,lm}^{(r)}$ is the $(l,m)$-element of the inverse of the $(2^j + r - 1) \times (2^j + r - 1)$ matrix

$$G_j^{(r)} = \left(\int_0^{2^j} N^{(r)}(u - l)\, N^{(r)}(u - m)\, du\right)_{l,m}.$$

Note that $G_j^{(r)}$ is a symmetric band matrix with bandwidth $r$. The projection can now also be written as

$$\pi_j^{(r)}(f)(x) = \int_0^1 K_j^{(r)}(x,y) f(y)\, dy$$

with the kernel given by

$$K_j^{(r)}(x,y) = 2^j \sum_{l=-r+1}^{2^j-1} \sum_{m=-r+1}^{2^j-1} g_{j,lm}^{(r)}\, N_{lj}^{(r)}(x)\, N_{mj}^{(r)}(y). \qquad (12)$$
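
The B-spline basis and the partition-of-unity property (11) can be checked numerically. The sketch below is illustrative only: it evaluates the cardinal B-spline $N^{(r)}$ through the standard Cox-de Boor recursion on integer knots (which yields the same function as the $r$-fold convolution of $1_{[0,1)}$), and the function names `cardinal_bspline` and `bspline_basis` are assumptions introduced here, not notation from the paper.

```python
import numpy as np


def cardinal_bspline(order: int, u):
    """Cardinal B-spline N^(r) on the integer knots 0, 1, ..., r."""
    u = np.asarray(u, dtype=float)
    if order == 1:
        # indicator of the half-open interval [0, 1)
        return ((u >= 0.0) & (u < 1.0)).astype(float)
    # Cox-de Boor recursion for uniform integer knots
    lower = cardinal_bspline(order - 1, u)
    upper = cardinal_bspline(order - 1, u - 1.0)
    return (u * lower + (order - u) * upper) / (order - 1)


def bspline_basis(order: int, level: int, x):
    """Rows are N_{lj}^{(r)}(x) = N^(r)(2^j x - l) for l = -r+1, ..., 2^j - 1."""
    x = np.asarray(x, dtype=float)
    shifts = np.arange(-order + 1, 2 ** level)
    return np.stack([cardinal_bspline(order, 2.0 ** level * x - l) for l in shifts])


# Example: order r = 3, resolution level j = 2
x = np.linspace(0.0, 1.0, 201)
basis = bspline_basis(3, 2, x)
column_sums = basis.sum(axis=0)   # numerically equal to 1 everywhere on [0, 1]
```

The number of rows of `basis` equals the dimension $2^j + r - 1$ of $S_j(r)$. For $r = 1$ the half-open indicator used here differs from the paper's convention only at the right endpoint $x = 1$.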



We shall frequently need to bound the maximal row-sum of the absolute values of the elements of the inverse of $G_j^{(r)}$, i.e., the $\ell^\infty$-operator norm of the inverse of $G_j^{(r)}$. For this we use the following special case of a result in Shadrin (2001, Theorem I and Section 4.2).

Proposition 4 For every $r \in \mathbb{N}$ there exist constants $0 < d_r < \infty$ (independent of $j$) such that for every $j \in \mathbb{N}$

$$\left\| \left(G_j^{(r)}\right)^{-1} \right\|_{\infty \to \infty} \leq d_r,$$

where $\|\cdot\|_{\infty\to\infty}$ denotes the $\ell^\infty$-operator norm on $\mathbb{R}^{2^j + r - 1}$.
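
To make Proposition 4 concrete, the following sketch assembles the Gram matrix $G_j^{(r)}$ numerically and evaluates the maximal absolute row-sum of its inverse for a few resolution levels. It is an illustration under assumptions: the entries are taken to be $\int_0^{2^j} N^{(r)}(u-l)N^{(r)}(u-m)\,du$, computed by Gauss-Legendre quadrature on each unit cell (exact here because the integrand is a polynomial on every cell), and the helper names are hypothetical. The code does not prove the bound; it only shows that the computed norms stay bounded as $j$ grows.

```python
import numpy as np


def cardinal_bspline(order: int, u):
    """Cardinal B-spline N^(r) on the integer knots 0, 1, ..., r."""
    u = np.asarray(u, dtype=float)
    if order == 1:
        return ((u >= 0.0) & (u < 1.0)).astype(float)
    lower = cardinal_bspline(order - 1, u)
    upper = cardinal_bspline(order - 1, u - 1.0)
    return (u * lower + (order - u) * upper) / (order - 1)


def gram_matrix(order: int, level: int, quad_points: int = 8):
    """G[l, m] = integral over [0, 2^j] of N^(r)(u - l) N^(r)(u - m) du."""
    nodes, weights = np.polynomial.legendre.leggauss(quad_points)
    nodes = 0.5 * (nodes + 1.0)      # map Gauss nodes from [-1, 1] to [0, 1]
    weights = 0.5 * weights
    shifts = np.arange(-order + 1, 2 ** level)
    gram = np.zeros((shifts.size, shifts.size))
    for cell in range(2 ** level):   # integrate cell by cell between integer knots
        u = cell + nodes
        values = np.stack([cardinal_bspline(order, u - l) for l in shifts])
        gram += (values * weights) @ values.T
    return gram


def inverse_row_sum_norm(order: int, level: int) -> float:
    """Maximal absolute row-sum (l-infinity operator norm) of the inverse Gram matrix."""
    inverse = np.linalg.inv(gram_matrix(order, level))
    return float(np.abs(inverse).sum(axis=1).max())


# Example: norms for linear splines (r = 2) at several resolution levels
norms_linear = [inverse_row_sum_norm(2, j) for j in range(2, 6)]
```

For $r = 2$ the matrix is tridiagonal and strictly diagonally dominant, so the elementary bound $1/\min_i(|a_{ii}| - \sum_{m \neq i} |a_{im}|) = 6$ applies to the computed norms; for higher orders no such elementary argument is available, which is where Shadrin's result is needed.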



We furthermore note that for $r \geq 2$ the Schoenberg space $S_j(r)$ is contained in the Sobolev space of order $r - 1$, and thus is also contained in $\mathcal{B}_2^{r-1}$. In fact, for every $r \geq 1$ we have that $S_j(r)$ is contained in $\mathcal{B}_2^{s}$ for $s < r - 1/2$ (DeVore and Lorentz (1993), Chap. 12, Lemma 3.1). Some approximation properties of splines that we shall use in the sequel are summarized in the Appendix.

For the spline projection estimators defined in Section 3 we make the useful observation that for every $J \geq 1$ and $r \geq 1$

$$\|\tilde p_{k,J,r}(\theta)\|_\infty \leq 2^J d_r (2^J + r - 1) \qquad (13)$$

holds uniformly in $\theta \in \Theta$, $k \geq 1$, and $v_1, \ldots, v_k \in V$. [To see this note that the B-spline basis functions are uniformly bounded by 1 and that the coefficients of $\tilde p_{k,J,r}(\theta)$ in the B-spline basis are bounded in absolute value by $2^J d_r$, uniformly in $\theta \in \Theta$, $k \geq 1$, $-r+1 \leq l \leq 2^J - 1$, and $v_1, \ldots, v_k \in V$, by Proposition 4.] The analogous relation is true for $\|\hat p_{n,j,r_*}\|_\infty$, as well as for $\|E\tilde p_{k,J,r}(\theta)\|_\infty$ and $\|E\hat p_{n,j,r_*}\|_\infty$.

5 Proofs

We shall use repeatedly in this section the fact that $\zeta_0 := \inf_{x \in [0,1]} p(\theta_0, x) > 0$ under Assumptions P1(i),(ii) (as $p(\theta_0)$ is continuous and positive on $[0,1]$ under these assumptions).



5.1 Proof of Proposition 2

Define the function

$$Q(\theta) = \int_0^1 (p(\theta_0) - p(\theta))^2\, p^{-1}(\theta_0)\, d\lambda, \qquad (14)$$

which is real-valued and is continuous in $\theta$ by dominated convergence, observing that $\zeta_0 > 0$ and that Assumption P1(ii) implies sup-norm boundedness of $\mathcal{P}_\Theta$ in view of the discussion following Proposition 7 in Appendix A. The unique minimizer of $Q(\theta)$ over $\Theta$ is $\theta_0$ in view of the identifiability assumption made in Assumption P1(i). To establish consistency, it is hence sufficient to prove

$$\sup_{\theta \in \Theta} |Q_{n,k}(\theta) - Q(\theta)| \to 0$$

in probability as $n \wedge k \to \infty$. Note that this supremum is measurable as $Q_{n,k}(\theta)$ and $Q(\theta)$ are continuous and $\Theta$ is separable. [For continuity of $Q_{n,k}$ see the proof of Proposition 1 in Appendix B.] Consider the set $A_n^* = \{\inf_{y \in [0,1]} \hat p_{n,j_n,r_*}(y) \geq \zeta_0/2\}$, which is clearly measurable. Since $\zeta_0 > 0$ as noted above, Corollary 2 (applied with $t = \delta \wedge \tau \wedge 1$ and noting that $p(\theta_0)$ is a continuous version of $p_0$ in view of Assumption P1(i)) implies that $\Pr(A_n^*) \to 1$ as $n \to \infty$. A simple calculation now shows that on the event $A_n^*$ (since $A_n^* \subseteq A_n$)



Ja 



1 - 



{Pk,j,A0)-p{0)) 



■ Pie) 



Pn 



d\+ 1 {pk.j,A0)~pm'p-:,„,r, 

dX 



holds. On A* we can then obtain the bound 



SUp\Qn.k{0)-Qm 



< 



\\Pn 



-Pi0o)\\ 



1 



2eo'sup|b(f?)|| 
fee 



+2^0^ sup \\pk,j,Ad)-p{0)\\ 
eee 



+ SUp\\pk.J„r{0)-p{9)\\ 

eee 



-4Co^sup|b(0)|| 
eee 



The sup-norm boundedness of $\mathcal{P}_\Theta$ together with Corollaries 1 and 2 (applied with $t = \delta \wedge \tau \wedge 1$) then complete the proof.



5.2 An Intermediate Result 

Consider the objective function 

$$Q_n(\theta) := Q_{n,j_n,r_*}(\theta) = \begin{cases} \int_0^1 (\hat p_{n,j_n,r_*} - p(\theta))^2\, \hat p^{-1}_{n,j_n,r_*}\, d\lambda & \text{on the event } A_n \\ 0 & \text{otherwise} \end{cases} \qquad (15)$$

corresponding to the 'ideal' case $k = \infty$. Let $\hat\theta_n := \hat\theta_{n,j_n,r_*}$ denote an arbitrary measurable minimizer of (15) over $\Theta$. [The existence of such an estimator is established in Proposition 10 in Appendix B.]

Theorem 2 Suppose $r_* \geq 2$ holds and Assumption P1 is satisfied with $1/2 < \tau < r_*$. If $2^{j_n} \sim n^{1/(2\tau+1)}$, then, as $n \to \infty$,

$$\sqrt{n}\,(\hat\theta_n - \theta_0) \xrightarrow{d} N(0, I(\theta_0)).$$

Proof. Consistency of $\hat\theta_n$ follows from Proposition 11 in Appendix B by choosing $\delta$ in that proposition sufficiently close to 1/2. It follows that $\hat\theta_n \in B(\theta_0)$ with probability tending to 1, and hence $\hat\theta_n$ belongs to the interior of $\Theta$ with probability tending to 1. In the following we work only on the intersection of the event $\{\hat\theta_n \in B(\theta_0)\}$ with $A_n^* = \{\inf_{y \in [0,1]} \hat p_{n,j_n,r_*}(y) \geq \zeta_0/2\}$, which also has probability converging to 1 as a consequence of Corollary 2 (applied with some $t$ satisfying $1/2 < t < \tau \wedge 1$). Note that $\|\hat p_{n,j_n,r_*}\|_\infty < \infty$ holds, and that $\|\hat p^{-1}_{n,j_n,r_*}\|_\infty \leq 2/\zeta_0$ on the event $A_n^*$. Furthermore, by Assumption P1(ii) the function $p(\theta)$ is bounded, uniformly in $\theta$, cf. Proposition 7 and the attending discussion in Appendix A. Assumption P1(iii) and dominated convergence then show that $Q_n(\theta)$ is twice continuously differentiable on the open ball $B(\theta_0)$ with derivatives given by

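$$\nabla_\theta Q_n(\theta) = -2\int_0^1 (\hat p_{n,j_n,r_*} - p(\theta))\, \hat p^{-1}_{n,j_n,r_*}\, \nabla_\theta p(\theta)\, d\lambda,$$

$$\nabla^2_\theta Q_n(\theta) = 2\int_0^1 \hat p^{-1}_{n,j_n,r_*}\, \nabla_\theta p(\theta)\nabla_\theta p(\theta)'\, d\lambda - 2\int_0^1 (\hat p_{n,j_n,r_*} - p(\theta))\, \hat p^{-1}_{n,j_n,r_*}\, \nabla^2_\theta p(\theta)\, d\lambda, \qquad (16)$$

and these derivatives are measurable functions for every $\theta \in B(\theta_0)$. Since $\hat\theta_n$ is an interior minimizer of $Q_n$ (on the event considered), we have that $\nabla_\theta Q_n(\hat\theta_n) = 0$. Consequently, a standard Taylor expansion gives

$$0 = \nabla_\theta Q_n(\hat\theta_n) = \nabla_\theta Q_n(\theta_0) + \nabla^2_\theta Q_n^*\, (\hat\theta_n - \theta_0), \qquad (17)$$

where the $i$-th row of $\nabla^2_\theta Q_n^*$ equals the corresponding row of $\nabla^2_\theta Q_n$ evaluated at a mean-value $\bar\theta_n$ which may depend on the row-index (measurability of $\bar\theta_n$ being no concern here). We now first establish that $n^{1/2}\nabla_\theta Q_n(\theta_0)$ is asymptotically normal with mean zero and variance-covariance matrix $4\int_0^1 \nabla_\theta p(\theta_0)\nabla_\theta p(\theta_0)'\, p^{-1}(\theta_0)\, d\lambda$. To this end write $(-1/2)n^{1/2}\nabla_\theta Q_n(\theta_0)$ as

$$\sqrt{n}\int_0^1 (\hat p_{n,j_n,r_*} - p(\theta_0))\, p(\theta_0)^{-1}\, \nabla_\theta p(\theta_0)\, d\lambda + \sqrt{n}\int_0^1 (\hat p_{n,j_n,r_*} - p(\theta_0)) \left(\hat p^{-1}_{n,j_n,r_*} - p(\theta_0)^{-1}\right) \nabla_\theta p(\theta_0)\, d\lambda,$$

both terms being measurable. The first term in the above display now converges to the required limit by Theorem 4 (applied with $t = \tau$, and some $s$ satisfying $1/2 < s < 1$, $s < \varsigma \wedge \tau$) and the Cramér-Wold device: To see this, observe that $p_0 \in \mathcal{B}_2^{\tau}$ by Assumption P1(i),(ii) (since $p_0 = p(\theta_0)$ $\lambda$-a.e.). Furthermore, for every $a \in \mathbb{R}^b$, $a \neq 0$, the function $f = p(\theta_0)^{-1} a'\nabla_\theta p(\theta_0)$ belongs to $\mathcal{B}_2^{\varsigma \wedge \tau}$ as a consequence of Assumption P1(ii),(iv) and Proposition 7 in Appendix A. Hence $\mathcal{F} = \{f\} \subseteq \mathcal{B}_2^{s}$. The conditions on $j_n$ in Theorem 4 follow from the assumption on $j_n$ in the current theorem. Finally note that $P(f) = 0$ under Assumption P1. The second term in the above display is bounded in norm (on the event $A_n^*$) by

$$\sqrt{n}\int_0^1 (\hat p_{n,j_n,r_*} - p(\theta_0))^2\, p(\theta_0)^{-1}\, \hat p^{-1}_{n,j_n,r_*}\, \|\nabla_\theta p(\theta_0)\|\, d\lambda \leq (2/\zeta_0^2) \sup_{x \in [0,1]} \|\nabla_\theta p(\theta_0,x)\|\; n^{1/2}\, \|\hat p_{n,j_n,r_*} - p(\theta_0)\|_2^2,$$

noting that $\|p(\theta_0)^{-1}\|_\infty \leq \zeta_0^{-1}$ and that every component of $\nabla_\theta p(\theta_0)$ is bounded on $[0,1]$ since it belongs to $\mathcal{B}_2^{\varsigma}$ with $\varsigma > 1/2$ by Assumption P1(iv). By Lemma 3 the r.h.s. in the above display is $O_P(n^{-1/2}2^{j_n} + n^{1/2}2^{-2j_n\tau})$, which is $o_P(1)$ because of $\tau > 1/2$.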

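Next we show that $\nabla^2_\theta Q_n^*$ converges to the positive definite matrix $\nabla^2_\theta Q(\theta_0)$ in (outer) probability. To this end we first show that $\nabla^2_\theta Q_n(\theta)$ converges to $\nabla^2_\theta Q(\theta)$ uniformly over $B(\theta_0)$ in probability, where $Q(\theta)$ has been defined in (14). By Assumption P1 and dominated convergence we have that $Q(\theta)$ is twice continuously differentiable on $B(\theta_0)$ with

$$\nabla^2_\theta Q(\theta) = 2\int_0^1 p(\theta_0)^{-1}\, \nabla_\theta p(\theta)\nabla_\theta p(\theta)'\, d\lambda - 2\int_0^1 (p(\theta_0) - p(\theta))\, p(\theta_0)^{-1}\, \nabla^2_\theta p(\theta)\, d\lambda.$$

We now see that

$$\nabla^2_\theta Q_n(\theta) - \nabla^2_\theta Q(\theta) = 2\int_0^1 \left(\hat p^{-1}_{n,j_n,r_*} - p(\theta_0)^{-1}\right) \nabla_\theta p(\theta)\nabla_\theta p(\theta)'\, d\lambda + 2\int_0^1 (p(\theta_0) - \hat p_{n,j_n,r_*})\, p(\theta_0)^{-1}\, \nabla^2_\theta p(\theta)\, d\lambda - 2\int_0^1 (\hat p_{n,j_n,r_*} - p(\theta)) \left(\hat p^{-1}_{n,j_n,r_*} - p(\theta_0)^{-1}\right) \nabla^2_\theta p(\theta)\, d\lambda$$

and we obtain (the supremum being measurable because of continuity of $\nabla^2_\theta Q_n$ and $\nabla^2_\theta Q$ on $B(\theta_0)$)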



sup \\ylQnie) - vlQi9)\ 



(18) 



< 2||p„,,„,,. -p(0o)||, 



sup 

9eB(0o) 



?'ni.r..P(^o)-M|Vsp(0)irdA 



+ 



Kj„,r. -p(e)|p;i,,.p(^o)-' \\ylp{9)\\dX+ / p{9o)-' \\Vlp{9)\\dX 



< 



■P{0o)\\ 



ACo^ f sup \\\/gp{9)\\'dX 
Jo 0eB{9o) 



+ Ueo^M sup \\pmL] + 

\ \ eeB{0a) I 



2^0 f sup \\wlp{9)\\dX -o, 
J Jo eeB(eo) 



(1), 



by Assumption P1 and Corollary 2 (applied with a $t$ satisfying $1/2 < t < \tau \wedge 1$). Since $\nabla^2_\theta Q(\theta)$ is continuous at $\theta_0$ as shown above and since $\bar\theta_n$ is consistent, convergence of $\nabla^2_\theta Q_n^*$ to $\nabla^2_\theta Q(\theta_0)$ in (outer) probability follows.

The central limit theorem for the score together with the convergence result for $\nabla^2_\theta Q_n^*$ just established now delivers the desired result: rewrite (17) as

$$0 = n^{1/2}\nabla_\theta Q_n(\theta_0) + \nabla^2_\theta Q(\theta_0)\, n^{1/2}(\hat\theta_n - \theta_0) + \left(\nabla^2_\theta Q_n^* - \nabla^2_\theta Q(\theta_0)\right) n^{1/2}(\hat\theta_n - \theta_0),$$

observe that $\nabla^2_\theta Q(\theta_0)$ is positive definite by Assumption P1(iii), and that the third term on the r.h.s. is of lower order than the second one. This implies that $n^{1/2}(\hat\theta_n - \theta_0)$ is stochastically bounded, and the desired result then easily follows. ■
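
The minimum distance estimator analysed in Theorem 2 can be illustrated on a toy model. The sketch below is a simplified stand-in, not the paper's construction: it assumes the hypothetical one-parameter family $p(\theta, x) = 1 + \theta(2x - 1)$ on $[0,1]$, replaces the spline projection estimator by a dyadic histogram (the case $r_* = 1$, which lies outside the assumptions of Theorem 2), approximates the integral in $Q_n$ by a midpoint rule, and minimizes over a grid of $\theta$ values. All function names are introduced here for illustration.

```python
import numpy as np


def density(theta: float, x):
    """Toy parametric density p(theta, x) = 1 + theta * (2x - 1) on [0, 1]."""
    return 1.0 + theta * (2.0 * np.asarray(x, dtype=float) - 1.0)


def sample(theta: float, size: int, rng):
    """Draw from the toy density by inverting its distribution function."""
    u = rng.uniform(size=size)
    if theta == 0.0:
        return u
    a = 1.0 - theta
    return (-a + np.sqrt(a * a + 4.0 * theta * u)) / (2.0 * theta)


def histogram_density(data, level: int):
    """Dyadic histogram with 2^level cells (piecewise-constant density estimate)."""
    counts, _ = np.histogram(data, bins=2 ** level, range=(0.0, 1.0))
    return counts * (2.0 ** level) / data.size


def objective(theta: float, p_hat, level: int) -> float:
    """Midpoint-rule approximation of the integral of (p_hat - p(theta))^2 / p_hat."""
    midpoints = (np.arange(2 ** level) + 0.5) / 2 ** level
    return float(np.mean((p_hat - density(theta, midpoints)) ** 2 / p_hat))


def minimum_distance_estimate(data, level: int, grid) -> float:
    p_hat = histogram_density(data, level)
    if np.any(p_hat <= 0.0):
        raise ValueError("density estimate must be positive on every cell")
    values = [objective(theta, p_hat, level) for theta in grid]
    return float(grid[int(np.argmin(values))])


# Example: recover theta_0 = 0.5 from n = 20000 observations
rng = np.random.default_rng(0)
data = sample(0.5, 20000, rng)
theta_hat = minimum_distance_estimate(data, level=5, grid=np.linspace(-0.9, 0.9, 181))
```

The positivity check mirrors the role of the event $A_n^*$ in the proof: the objective is only meaningful where the density estimate is bounded away from zero.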

For the same reasons as given in Remark 4 the condition $\tau < r_*$ in the above theorem is not really a restriction. Furthermore, examining the proof shows that the conclusions of the theorem also hold for other choices of $2^{j_n}$: e.g., the theorem (without the condition $\tau < r_*$) holds for $2^{j_n} \sim n^{\psi}$ with $\psi$ satisfying $1/(2((\tau \wedge r_*) + (\varsigma \wedge \tau \wedge 1))) < \psi < 1/2$.

5.3 Proof of Part a1 of Theorem 1

We first provide an auxiliary result that relates the objective function $Q_{n,k}(\theta)$ to the somewhat simpler objective function $Q_n(\theta)$ studied in the preceding section. Note that $k$ is not linked to $n$ in the subsequent proposition.

Proposition 5 Suppose $r \geq 2$ and $r_* \geq 2$ hold and Assumptions P1(i),(ii) are satisfied for some $1/2 < \tau < r_* \wedge r$. Suppose further that Assumption R(i) is satisfied and that $2^{j_n} \sim n^{1/(2\tau+1)}$ and $2^{J_k} \sim k^{1/(2\tau+1)}$. Then for every $\varepsilon > 0$ there exists a positive real number $M(\varepsilon)$ and a natural number $N(\varepsilon)$ such that

$$\Pr\left(\sqrt{k}\, \sup_{\theta \in \Theta} |Q_{n,k}(\theta) - Q_n(\theta)| > M(\varepsilon)\right) < \varepsilon \qquad (19)$$

holds for all $n \geq N(\varepsilon)$ and all $k \geq 1$.



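Proof. First note that the supremum in (19) is measurable since $Q_{n,k}(\theta)$ and $Q_n(\theta)$ are continuous in $\theta$ as noted before, cf. Section 5.1. For given $\varepsilon > 0$ choose $N(\varepsilon)$ large enough such that for $n \geq N(\varepsilon)$ we have $\Pr(A_n^*) \geq 1 - \varepsilon$ where $A_n^* = \{\inf_{y \in [0,1]} \hat p_{n,j_n,r_*}(y) \geq \zeta_0/2\}$. This is possible by Corollary 2. A simple calculation shows that on the event $A_n^*$

$$Q_{n,k}(\theta) - Q_n(\theta) = \int_0^1 (\tilde p_{k,J_k,r}(\theta) - p(\theta)) \left(\frac{\tilde p_{k,J_k,r}(\theta) + p(\theta)}{\hat p_{n,j_n,r_*}} - 2\right) d\lambda$$

holds. Choose $s$ to satisfy $1/2 < s < \tau \wedge 1$. Applying Corollaries 1 and 2 (with $t = s$) shows that for the given $\varepsilon > 0$ there exists a positive finite $D$ such that the events

$$A_{n,k}^{**} = \left\{\sup_{\theta \in \Theta} \|\tilde p_{k,J_k,r}(\theta)\|_{s,2} \leq D,\; \|\hat p_{n,j_n,r_*}\|_{s,2} \leq D\right\}$$

have probability not less than $1 - \varepsilon$ for every $k \geq 1$ and $n \geq 1$. Applying Proposition 7 in Appendix A we conclude that there exists a finite positive $D'$, depending only on $D$, $\zeta_0$, and $\sup_{\theta \in \Theta} \|p(\theta)\|_{s,2}$ (which is finite by Assumption P1(ii) and continuous embedding of $\mathcal{B}_2^{\tau}$ in $\mathcal{B}_2^{s}$), such that on $A_n^* \cap A_{n,k}^{**}$

$$\sup_{\theta \in \Theta} \left\| (\tilde p_{k,J_k,r}(\theta) + p(\theta))\, \hat p^{-1}_{n,j_n,r_*} - 2 \right\|_{s,2} \leq D'$$

holds. Thus for every $M > 0$, all $k \geq 1$, and all $n \geq N(\varepsilon)$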



Pr K/fcsup |Q„,fc(0) ~ Qn{e)\ > M 
\ eee 



sup sup 

VI eee ll/IU,2<-D' 



{Pk,j,A0)-pmfdx 



> M} nAln A, 



n,k 



2s 



< Pr( \Vk sup \\Pk J, A0)-PiS)\\^> m\) +2e 



eee 



where $\mathcal{F}$ denotes $\{f \in \mathcal{B}_2^{s} : \|f\|_{s,2} \leq D'\}$ and $\|\cdot\|_{\mathcal{F}}$ is defined before Theorem 3. Choose an $s'$ satisfying $1/2 < s' < s$. Then Theorem 3 (applied with $t = \tau$) implies for every $k \geq 1$

V^sup \\Pk,j,Ad) - Pio)y < Aksnp \\Pk,.i,As) - PkWy + %/fcsup \\Pk{e) - p{e)y 



9ee 



eee 



see 



= Op (Vfc2-'^''(^+") + 2^'^''(''-"') + 1) = Op(l). 
[Measurability of the suprema on the r.h.s. in the first line of the above display is established in the proof of Theorem 3. The argument given there also establishes measurability of the supremum on the l.h.s.] This completes the proof (noting that the l.h.s. in the above display is certainly a real-valued random variable for every $k$). ■

The closeness of $Q_{n,k}$ and $Q_n$ expressed in the previous result translates into closeness of the minimizers of these functions with the help of the following simple but useful lemma, which is taken from Gach (2010). Note that $M_2$ below is smooth but $M_1$ need not be so. This is relevant as $Q_{n,k}$ is not guaranteed to be smooth under the assumptions of Part a1 of Theorem 1, whereas $Q_n$ is in view of Assumption P1.

Lemma 1 Let $U$ be a nonempty convex open subset of $\mathbb{R}^b$. Suppose we are given functions $M_1 : U \to \mathbb{R}$ and $M_2 : U \to \mathbb{R}$, such that $M_2$ is twice partially differentiable on $U$ with Hessian satisfying

$$\inf_{x \in U} y'\nabla^2 M_2(x)\, y \geq c\|y\|^2 \qquad (20)$$

for every $y \in \mathbb{R}^b$ and some $0 < c < \infty$. If $m_1 \in U$ and $m_2 \in U$ minimize $M_1$ and $M_2$ over $U$, respectively, we have

$$\|m_1 - m_2\| \leq 2c^{-1/2} \sqrt{\sup_{u \in U} |M_1(u) - M_2(u)|},$$

where $\|\cdot\|$ denotes the Euclidean norm on $\mathbb{R}^b$.

Proof. Assume that minimizers $m_1$ and $m_2$ exist, since otherwise there is nothing to prove. [By convexity of $U$ and the assumption on the Hessian the minimizer $m_2$ is unique.] Since $m_2$ is a minimizer of the twice partially differentiable function $M_2$ on the convex open set $U$, we have

$$M_2(m_1) = M_2(m_2) + 2^{-1}(m_1 - m_2)'\nabla^2 M_2(\bar m)(m_1 - m_2)$$

(using a pathwise Taylor series expansion) where $\bar m$ lies in the convex hull of $\{m_1, m_2\}$. We conclude from the assumption on the Hessian that

$$\|m_1 - m_2\| \leq (2c^{-1})^{1/2} \sqrt{|M_2(m_1) - M_2(m_2)|}. \qquad (21)$$

Observe next that

$$M_1(m_1) - M_2(m_2) \leq M_1(m_2) - M_2(m_2) \leq \sup_{u \in U} |M_1(u) - M_2(u)|$$

and

$$M_1(m_1) - M_2(m_2) \geq M_1(m_1) - M_2(m_1) \geq -\sup_{u \in U} |M_1(u) - M_2(u)|$$

so that

$$|M_1(m_1) - M_2(m_2)| \leq \sup_{u \in U} |M_1(u) - M_2(u)|.$$

Consequently,

$$|M_2(m_1) - M_2(m_2)| \leq |M_2(m_1) - M_1(m_1)| + |M_1(m_1) - M_2(m_2)| \leq 2\sup_{u \in U} |M_1(u) - M_2(u)|,$$

which, when plugged into (21), proves the lemma. ■
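
A quick numerical illustration of Lemma 1 in one dimension follows. The two functions are hypothetical choices made here: $M_2(x) = x^2$ (Hessian equal to 2, so (20) holds with $c = 2$) and a non-smooth perturbation $M_1$. Minimizers and the supremum are computed on a finite grid that contains the exact minimizer of $M_2$; on such a grid the chain of inequalities in the proof goes through unchanged, so the bound must hold for the grid quantities as well.

```python
import numpy as np


def lemma_bound(sup_gap: float, c: float) -> float:
    """Right-hand side of Lemma 1: 2 * c^(-1/2) * sqrt(sup |M1 - M2|)."""
    return 2.0 * np.sqrt(sup_gap / c)


# grid on the open interval U = (-1, 1); step 0.001, contains the point 0
grid = np.linspace(-0.999, 0.999, 1999)

smooth = grid ** 2                                          # M2, satisfies (20) with c = 2
rough = smooth + 0.05 * np.abs(np.sin(25.0 * grid + 1.0))   # M1, not differentiable

m1 = grid[np.argmin(rough)]
m2 = grid[np.argmin(smooth)]
sup_gap = float(np.max(np.abs(rough - smooth)))

distance = abs(m1 - m2)
bound = lemma_bound(sup_gap, c=2.0)   # distance should not exceed this value
```

The example also reflects the remark preceding the lemma: only $M_2$ needs to be smooth, while $M_1$ may have kinks.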

The proof of Part a1 of Theorem 1 is now as follows: Let $U \subseteq B(\theta_0)$ be a sufficiently small open ball around $\theta_0$ such that the smallest eigenvalues of $\nabla^2_\theta Q(\theta)$ are bounded away from zero by a positive constant, $\eta$ say, uniformly in $\theta \in U$. Such a $U$ exists, since $\nabla^2_\theta Q(\theta)$ is continuous on $B(\theta_0)$, as shown in Section 5.2, and since $\nabla^2_\theta Q(\theta_0)$ is positive definite by Assumption P1. Now apply Lemma 1 with $M_1 = Q_{n,k(n)}$, $M_2 = Q_n$, and the set $U$ just mentioned. Note that condition (20) is then satisfied for $M_2 = Q_n$ and $c = \eta/2$ on an event $E_n$ that has probability converging to 1 in view of the choice of $U$ and since it was shown in the proof of Theorem 2 that $\nabla^2_\theta Q_n(\theta)$ converges to $\nabla^2_\theta Q(\theta)$ uniformly on $B(\theta_0)$ in probability. Observe also that Proposition 5 implies

$$\sup_{\theta \in \Theta} |Q_{n,k(n)}(\theta) - Q_n(\theta)| = O_P(k(n)^{-1/2}).$$

Taken together, this implies

$$\|\hat\theta_{n,k(n)} - \hat\theta_n\| = O_P(k(n)^{-1/4}), \qquad (22)$$

which is $o_P(n^{-1/2})$ in view of Assumption S1. Part a1 of Theorem 1 now follows from asymptotic normality of $\sqrt{n}\,(\hat\theta_n - \theta_0)$, which has already been established in Theorem 2.



5.4 Proof of the Remaining Parts of Theorem 1

Observe first that it suffices to show that every subsequence $n_i$ of $n$ contains a further subsequence $n_{i(l)}$ along which the claimed asymptotic normality result holds. Given $n_i$, we may choose the subsequence $n_{i(l)}$ in such a way that $\lim_{l\to\infty} k(n_{i(l)})/n_{i(l)}^2$ exists (possibly being $\infty$) since the extended real line is compact. But the sequence $k(n_{i(l)})$ can be viewed as the subsequence $\tilde k(n_{i(l)})$ of a sequence $\tilde k(n)$ for which $\lim_{n\to\infty} \tilde k(n)/n^2$ exists (and necessarily equals $\lim_{l\to\infty} k(n_{i(l)})/n_{i(l)}^2$). This shows that for the proof we may assume without loss of generality that $\lim_{n\to\infty} k(n)/n^2$ exists (possibly being $\infty$). In the case where this limit is infinite, the results then follow from Part a1, which has already been proved in Section 5.3. Thus we may assume without loss of generality not only that the limit of $k(n)/n^2$ exists, but also that

$$\lim_{n\to\infty} k(n)/n^2 < \infty. \qquad (23)$$

We shall make this assumption for the remainder of this section.
Under Assumption R and if $r \geq 4$ the mapping
2'^ — 12'^ — 1 / k \ 

l=-r+lm=-r+l \ i=l / 

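is twice continuously differentiable on $B(\theta_0)$ for every $y$ and every realization of $V_1, \ldots, V_k$ by the chain rule. Similarly as in the proof of Theorem 2 it suffices to work only on the event $A_n^* \cap \{\hat\theta_{n,k(n)} \in B(\theta_0)\}$, which has probability converging to 1 in view of Proposition 2 (applied with $\delta > 1/2$ sufficiently close to 1/2) and Corollary 2 (applied with some $t$ satisfying $1/2 < t < \tau \wedge 1$). Note that $\|p(\theta_0)^{-1}\|_\infty \leq \zeta_0^{-1}$ and that $\|\hat p^{-1}_{n,j_n,r_*}\|_\infty \leq 2/\zeta_0$ holds on the before mentioned event; we shall use these facts repeatedly in the sequel. Using this, (13), boundedness of $N^{(r)}$ and of its first two derivatives as well as Assumption R, one concludes from the dominated convergence theorem that also the objective function $Q_{n,k}$ introduced in Section 3 is twice continuously differentiable on the neighborhood $B(\theta_0)$ with derivatives (measurable for every $\theta \in B(\theta_0)$)

$$\nabla_\theta Q_{n,k}(\theta) = -2\int_0^1 (\hat p_{n,j_n,r_*} - \tilde p_{k,J_k,r}(\theta))\, \hat p^{-1}_{n,j_n,r_*}\, \nabla_\theta \tilde p_{k,J_k,r}(\theta)\, d\lambda,$$

$$\nabla^2_\theta Q_{n,k}(\theta) = 2\int_0^1 \hat p^{-1}_{n,j_n,r_*}\, \nabla_\theta \tilde p_{k,J_k,r}(\theta)\nabla_\theta \tilde p_{k,J_k,r}(\theta)'\, d\lambda - 2\int_0^1 (\hat p_{n,j_n,r_*} - \tilde p_{k,J_k,r}(\theta))\, \hat p^{-1}_{n,j_n,r_*}\, \nabla^2_\theta \tilde p_{k,J_k,r}(\theta)\, d\lambda. \qquad (24)$$

Since $\hat\theta_{n,k(n)}$ is an interior minimizer of $Q_{n,k(n)}$ (on the event considered), we clearly have that $\nabla_\theta Q_{n,k(n)}(\hat\theta_{n,k(n)}) = 0$. Consequently, a standard Taylor expansion gives

$$0 = \nabla_\theta Q_{n,k(n)}(\hat\theta_{n,k(n)}) = \nabla_\theta Q_{n,k(n)}(\theta_0) + \nabla^2_\theta Q^*_{n,k(n)}\, (\hat\theta_{n,k(n)} - \theta_0), \qquad (25)$$

where the $i$-th row of $\nabla^2_\theta Q^*_{n,k(n)}$ equals the corresponding row of $\nabla^2_\theta Q_{n,k(n)}$ evaluated at a mean-value $\bar\theta_{n,k(n)}$ which may depend on the row-index (measurability of the mean-value being of no concern). We next show that $\sqrt{n}\,\nabla_\theta Q_{n,k(n)}(\theta_0)$ is asymptotically normal and that $\nabla^2_\theta Q^*_{n,k(n)}$ converges in (outer) probability to the positive definite matrix $\nabla^2_\theta Q(\theta_0)$. The asymptotic normality of $\sqrt{n}\,(\hat\theta_{n,k(n)} - \theta_0)$ then follows along the same lines as in the last paragraph of the proof of Theorem 2.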

Step 1: CLT for the score $\sqrt{n}\,\nabla_\theta Q_{n,k(n)}(\theta_0)$.
We decompose the score as follows:

$$\nabla_\theta Q_{n,k(n)}(\theta_0) = -2\int_0^1 (\hat p_{n,j_n,r_*} - p(\theta_0))\, p(\theta_0)^{-1}\, \nabla_\theta p(\theta_0)\, d\lambda + 2\int_0^1 (\tilde p_{k(n),J_{k(n)},r}(\theta_0) - p(\theta_0))\, p(\theta_0)^{-1}\, \nabla_\theta p(\theta_0)\, d\lambda + 2\int_0^1 (\hat p_{n,j_n,r_*} - \tilde p_{k(n),J_{k(n)},r}(\theta_0)) \left(p(\theta_0)^{-1}\nabla_\theta p(\theta_0) - \hat p^{-1}_{n,j_n,r_*}\nabla_\theta \tilde p_{k(n),J_{k(n)},r}(\theta_0)\right) d\lambda =: I + II + III,$$

with each of the terms being measurable. We further observe that the terms $I$ and $II$ are independent by construction of the simulation mechanism.

About Term I: As shown in the proof of Theorem 2,

$$\sqrt{n}\, I \xrightarrow{d} N(0, \Sigma)$$

where

$$\Sigma = 4\int_0^1 \nabla_\theta p(\theta_0)\nabla_\theta p(\theta_0)'\, p(\theta_0)^{-1}\, d\lambda.$$

About Term II: Exactly the same argument as given in the proof of Theorem 2 for term $I$, except for using Theorem 3 instead of Theorem 4, establishes that

$$\sqrt{k(n)}\, II \xrightarrow{d} N(0, \Sigma).$$

But then

$$\sqrt{n}\, II = \sqrt{n/k(n)}\, \sqrt{k(n)}\, II \xrightarrow{d} N(0, \kappa^{-1}\Sigma)$$

under Assumption S3, and $\sqrt{n}\, II$ converges to zero in probability under Assumption S2.

About Term III: By Cauchy-Schwarz and the triangle inequality we have the bound



< 2 



Pn,j„,r, -Pfe(„),J,(„),r(6'o) [(2/Co) II (P' 



2 



-p(0o))Vep(0o)|| 



-(2/eo) VePfc(„),j,, ,,.(0o) - V0p(0o) 



< (4/^o) lbn,i„,r. -^(6*0)112 + P(^o) " Pfc(n) ,.7fc(„) ,r (^'o) 



a/^o)\\Pn,J„,r,-pi0o)\\,\\yop{0o)\\ 



^ePk{n).Ju„)A^o) ~ Vep(6'o) 



with $\|\nabla_\theta p(\theta_0)\|_\infty$ being finite in view of Assumption P1(iv) and Proposition 7 in Appendix A. The r.h.s. of the above display is now




k{n) 



for every 0<s<r, s<<;in view of Assumptions PI and R as well as Lemmata |3] and 
Fixing such an $s > 1/2$, the expression in the above display is seen to be $o_P(n^{-1/2})$ under the assumptions of Part a2 or Part b (in particular, $\tau > 3/2$), showing that $\sqrt{n}\, III$ is asymptotically negligible.

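This completes Step 1 and shows that

$$\sqrt{n}\,\nabla_\theta Q_{n,k(n)}(\theta_0) \xrightarrow{d} N\left(0, (1 + \kappa^{-1})\Sigma\right)$$

under the assumptions of Part b, whereas under the assumptions of Part a2

$$\sqrt{n}\,\nabla_\theta Q_{n,k(n)}(\theta_0) \xrightarrow{d} N(0, \Sigma).$$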
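
The factor $1 + \kappa^{-1}$ arises because the data-based term and the simulation-based term are independent and enter with the same asymptotic variance. The following Monte Carlo sketch is a toy analogy rather than the paper's estimator: it replaces the two terms by independent sample means of standard normal variables (an assumption made purely for illustration), so that $\sqrt{n}$ times their difference has variance exactly $1 + n/k$.

```python
import numpy as np


def inflation_factor(kappa: float) -> float:
    """Variance inflation 1 + 1/kappa when k(n)/n converges to kappa."""
    return 1.0 + 1.0 / kappa


def simulated_score_variance(n: int, kappa: float, replications: int, rng) -> float:
    """Monte Carlo variance of sqrt(n) * (data mean - simulated mean), a toy analogue."""
    k = int(round(kappa * n))
    data_part = rng.standard_normal((replications, n)).mean(axis=1)
    simulation_part = rng.standard_normal((replications, k)).mean(axis=1)
    return float(np.var(np.sqrt(n) * (data_part - simulation_part)))


# Example: kappa = 0.5, i.e. half as many simulated as observed data points
rng = np.random.default_rng(0)
empirical = simulated_score_variance(n=200, kappa=0.5, replications=4000, rng=rng)
theoretical = inflation_factor(0.5)   # equals 3.0
```

With $\kappa \to \infty$ the factor tends to 1, which corresponds to the regime of Part a2 where the simulation error is asymptotically negligible.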



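Step 2: Convergence of second order derivatives.
We have

$$\left\|\nabla^2_\theta Q^*_{n,k(n)} - \nabla^2_\theta Q(\theta_0)\right\| \leq \left\|\nabla^2_\theta Q^*_{n,k(n)} - \nabla^2_\theta Q^*_n\right\| + \left\|\nabla^2_\theta Q^*_n - \nabla^2_\theta Q(\theta_0)\right\|,$$

where $\nabla^2_\theta Q^*_n$ is the matrix $\nabla^2_\theta Q_n$ row-wise evaluated at the mean-values $\bar\theta_{n,k(n)}$. In view of (18), consistency of $\hat\theta_{n,k(n)}$, and continuity of $\nabla^2_\theta Q$ at $\theta_0$, the second term on the r.h.s. above converges to zero in (outer) probability. We now show the same for the first term on the r.h.s. in the above display: Note that the argument leading to (22) is also valid under the current assumptions, and therefore we can conclude from (22), (23), and Theorem 2 that

$$\|\hat\theta_{n,k(n)} - \theta_0\| = O_P(k(n)^{-1/4}).$$

Consequently, it suffices to show that

$$\sup_{\theta \in B(\theta_0),\ \|\theta - \theta_0\| \leq Mk(n)^{-1/4}} \left\|\nabla^2_\theta Q_{n,k(n)}(\theta) - \nabla^2_\theta Q_n(\theta)\right\| \to 0$$

in probability for every $0 < M < \infty$, the above supremum being measurable (as the functions involved are continuous). Now, by (24) and (16),

$$\nabla^2_\theta Q_{n,k(n)}(\theta) - \nabla^2_\theta Q_n(\theta) = -2\int_0^1 (\hat p_{n,j_n,r_*} - p(\theta))\, \hat p^{-1}_{n,j_n,r_*} \left(\nabla^2_\theta \tilde p_{k(n),J_{k(n)},r}(\theta) - \nabla^2_\theta p(\theta)\right) d\lambda - 2\int_0^1 (p(\theta) - \tilde p_{k(n),J_{k(n)},r}(\theta))\, \hat p^{-1}_{n,j_n,r_*}\, \nabla^2_\theta \tilde p_{k(n),J_{k(n)},r}(\theta)\, d\lambda + 2\int_0^1 \hat p^{-1}_{n,j_n,r_*} \left(\nabla_\theta \tilde p_{k(n),J_{k(n)},r}(\theta)\nabla_\theta \tilde p_{k(n),J_{k(n)},r}(\theta)' - \nabla_\theta p(\theta)\nabla_\theta p(\theta)'\right) d\lambda =: I - II + III.$$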
About Term I: By the Cauchy-Schwarz and the triangle inequalities 

||/|i < 2CoMlb«J„.^. -p(^0)|l2 + IW^0)-p(e)|l2] X 

The first term on the r.h.s. of the above display is $O_P(n^{-\tau/(2\tau+1)})$ in view of Lemma 3 and the choice of $j_n$. For the second term, observe that in view of Assumption P1(iii) we have $p(\theta,x) - p(\theta_0,x) = \nabla_\theta p(\bar\theta(x),x)'(\theta - \theta_0)$ by the pathwise mean value theorem, and hence

$$\|p(\theta_0) - p(\theta)\|_2 \leq \left(\int_0^1 \sup_{\theta \in B(\theta_0)} \|\nabla_\theta p(\theta,x)\|^2\, dx\right)^{1/2} \|\theta - \theta_0\| = O(\|\theta - \theta_0\|)$$

holds for all $\theta \in B(\theta_0)$. In view of Lemma 5 and the choice of $J_{k(n)}$, the supremum over $B(\theta_0)$ of the third term is $O_P\left(k(n)^{(2-\tau)/(2\tau+1)}\sqrt{\log k(n)}\right)$. Furthermore, note that



(26) 



holds for 9 e B{9o). [This is proved analogously as ([M]) in SectionlHl making use of the dominance 
assumptions on Vgp in Assumption PI, the uniform boundedness assumption on the derivatives 
of p in assumption R(ii), the boundedness of the B-spline basis functions and their first two 

derivatives (as r > 4 holds), as well as using that gg'^gg^ G in view of Assumption P2(ii).] 
The above established relation, together with the fact that the spectral matrix norm is bounded 
by the Frobenius norm, implies that the supremum over B{9q) of the fourth term is bounded by 



sup J2 



d9id9^' 



d^p{9) 



< sup \] 

2 eeB(eo),,,=i 



d^p{9) 



d9ide,, 



< oo 



the last inequality following from Assumption P2(ii). Consequently, in view of (j23p . 

sup ||/|| 
6(eB(eo),||e-eo||<Affc(n)-i/4 



< 



Op(n-^/(2-+i))+0(fc(n)-i/4) Op(fc(n)(2—)/(2-+i)0ogfc(n))+ const = Op(l) 
under either the assumptions of Part a2 or Part b (since $\tau > 3/2 > 4/3$).
About Term II: By the Cauchy-Schwarz and the triangle inequalities



sup ||//||<2Co^ sup P{0) - Pk{n),J^,„uriS) 

eeB(0o) oeBiOo) 



sup 

eeB(e„) 



= Op(fc(n)-"/(2r+i)0Qg^(^)) Op(fc(n)(2-")/(2-+i) ^log fc(n)) + const 
where we have made use of Lemmata [3] and [5] and we have used the bound 



(27) 



sup 

eeB(eo) 



EVlpk(„),J^,„,r{d) < sup V 
2 eeB(9o) ,7^1 



dose,' 



< oo 



which follows from (26) and Assumption P2(ii). The r.h.s. of (27) is now $o_P(1)$ since $\tau > 3/2 > 1$.
About Term III: By the Cauchy-Schwarz and the triangle inequalities

||///|| < 2^0^! V0Pfc(„).j,, , ,(0) - VepW [ VePfc(„).j,, ,,.(0)-V(,p(^) +2\\\/epm\ 



Now 



sup 

eeB(,0„) 



by Lemma [S] and since r > 3/2 > 1. Furthermore, 

dpk(n),j,^,^^As) dp{e) 



sup 

96B(eo) 



E\/ePk{n),j,,^,,M-^ep{e) 



< V sup 

6 



E sup 

0GS(eo) 





do. 


fdp{9)\ 


dp{9) 







the last equality holding as shown in p9p in Section [B] By Proposition [5] in Appendix \K\ and As- 
sumption P2(i) the r.h.s. in the above display is now o(l). Taken together, this provides a bound 
for s\i]ygfzB{eo) ,\\e ~eQ\\<Mk(n)-^/ ll-^^^ll which converges to zero in probability. This completes the 
proof of Step 2. 



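5.5 Proof of Proposition 3

Since $\tilde\theta_{n,k} \to \theta_0$ by assumption, since $\Phi(\theta) = \int_0^1 \nabla_\theta p(\theta)\nabla_\theta p(\theta)'\, p(\theta_0)^{-1}\, d\lambda$ is continuous on the neighborhood $B(\theta_0)$ of $\theta_0$ by dominated convergence and Assumption P1(iii), and since $\Phi(\theta_0)$ is positive definite by the same assumption, it suffices to show that, uniformly over $B(\theta_0)$, the expression $\hat\Phi(\theta) = \int_0^1 \nabla_\theta \tilde p_{k,J'_k,r'}(\theta)\nabla_\theta \tilde p_{k,J'_k,r'}(\theta)'\, \hat p^{-1}_{n,j'_n,r'_*}\, d\lambda$ converges to $\Phi(\theta)$ in probability as $n \wedge k \to \infty$. Note that $\hat\Phi(\theta)$ is well-defined on the event $A_n^*$, which has probability converging to 1 in view of Corollary 2. In the sequel we only work on that event. Now

$$\left\|\hat\Phi(\theta) - \Phi(\theta)\right\| \leq \left\|\int_0^1 \nabla_\theta p(\theta)\nabla_\theta p(\theta)' \left(\hat p^{-1}_{n,j'_n,r'_*} - p(\theta_0)^{-1}\right) d\lambda\right\| + \left\|\int_0^1 \left(\nabla_\theta \tilde p_{k,J'_k,r'}(\theta)\nabla_\theta \tilde p_{k,J'_k,r'}(\theta)' - \nabla_\theta p(\theta)\nabla_\theta p(\theta)'\right) \hat p^{-1}_{n,j'_n,r'_*}\, d\lambda\right\|$$

$$\leq 2\zeta_0^{-2}\, \|\hat p_{n,j'_n,r'_*} - p(\theta_0)\|_\infty \int_0^1 \sup_{\theta \in B(\theta_0)} \|\nabla_\theta p(\theta)\|^2\, d\lambda + 2\zeta_0^{-1}\, \|\nabla_\theta \tilde p_{k,J'_k,r'}(\theta) - \nabla_\theta p(\theta)\|_2 \left[\|\nabla_\theta \tilde p_{k,J'_k,r'}(\theta) - \nabla_\theta p(\theta)\|_2 + 2\|\nabla_\theta p(\theta)\|_2\right].$$

The first term on the r.h.s. is independent of $\theta$ and converges to zero in probability by Corollary 2. The supremum over $B(\theta_0)$ of the second term converges to zero by essentially repeating the argument that has been used in the very last step of the proof of Theorem 1.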



6 Rates of Convergence for Spline Projection Estimators 

This section contains the main stochastic bounds used to control remainder terms in the proofs in Section 5. We first collect some simple facts about the B-splines $N^{(r)}$ that will repeatedly be used in this section:



< 1, 



1, 



< 1 for r > 1. 



(28) 



The first relation is a direct consequence of the definition of $N^{(r)}$, the second one follows since $N^{(r)}$ is, as a convolution of probability densities, a probability density again, and the third relation is a consequence of Young's inequality. Furthermore, it is easy to see that $N^{(r)}$ is continuously differentiable for $r \geq 3$ with derivative $N^{(r)\prime}$ given by

$$N^{(r)\prime} = N^{(r-1)} - N^{(r-1)}(\cdot - 1). \qquad (29)$$

For $r = 2$, the B-spline $N^{(2)}$ is Lipschitz and only has a weak derivative $N^{(2)\prime}$ which, in order to have it defined everywhere, will always be taken as $N^{(1)} - N^{(1)}(\cdot - 1)$. The bounds


< 1, 



7V('-)' 



< 2, 



iV('-)' 



< 2 for r > 2 



(30) 



are then an immediate consequence of (28), (29), and the fact that $N^{(r-1)}$ is nonnegative. By repeated application of (29) we can obtain bounds for higher-order derivatives, for example, we shall need



< 2, 



< 4 for r > 3, and 



< 4 for r > 4. 



(31) 



The above discussion also implies that $N^{(r)}$ for $r \ge 2$, $N^{(r)\prime}$ for $r \ge 3$, and $N^{(r)\prime\prime}$ for $r \ge 4$ are globally Lipschitz on $\mathbb{R}$ with Lipschitz constants bounded by 1, 2, and 4, respectively.
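The elementary B-spline facts collected above can be checked numerically. The following is only a minimal sketch: it assumes the standard recurrence for cardinal B-splines (which is not stated in the text), approximates all norms by Riemann sums on a grid, and uses the hypothetical helper names `bspline`, `bspline_deriv` and `norms`.

```python
import numpy as np

def bspline(r, x):
    """Cardinal B-spline N^(r), the r-fold convolution of the indicator of [0, 1).

    Assumption: evaluated through the standard recurrence
    N^(r)(x) = [x N^(r-1)(x) + (r - x) N^(r-1)(x - 1)] / (r - 1).
    """
    x = np.asarray(x, dtype=float)
    if r == 1:
        return ((x >= 0) & (x < 1)).astype(float)
    return (x * bspline(r - 1, x) + (r - x) * bspline(r - 1, x - 1)) / (r - 1)

def bspline_deriv(r, x):
    """(Weak) derivative of N^(r) for r >= 2, taken as N^(r-1) - N^(r-1)(. - 1)."""
    x = np.asarray(x, dtype=float)
    return bspline(r - 1, x) - bspline(r - 1, x - 1)

def norms(values, dx):
    """Riemann-sum approximations of the sup-, L1- and L2-norms on a grid."""
    return (np.max(np.abs(values)),
            np.sum(np.abs(values)) * dx,
            np.sqrt(np.sum(values ** 2) * dx))

dx = 1e-3
for r in range(2, 6):
    grid = np.arange(-1.0, r + 1.0, dx)   # covers the support [0, r]
    print(r, norms(bspline(r, grid), dx), norms(bspline_deriv(r, grid), dx))
```

For each order the three printed norms of $N^{(r)}$ should stay at or below 1 (with the $L^1$-norm close to 1), and those of the derivative at or below 1, 2 and 2, up to discretization error.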

For $f \in S_J(r)$, $r \ge 3$, we denote in the following by $f'$ its derivative (using one-sided derivatives on the boundary of $[0, 1]$); for $r = 2$ we use $f'$ to denote the weak derivative.



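Lemma 2 Let $f = \sum_{l=-r+1}^{2^J-1} a_l N^{(r)}_{lJ}$, where the $a_l$ are real numbers and $r \ge 1$, i.e., $f \in S_J(r)$. Then

$$\|f\|_2 \le 2^{-J/2} \Big( \sum_{l=-r+1}^{2^J-1} a_l^2 \Big)^{1/2}, \tag{32}$$

$$\|f'\|_2 \le 2^{1+J/2} \Big( \sum_{l=-r+1}^{2^J-1} a_l^2 \Big)^{1/2} \quad \text{for } r \ge 2, \tag{33}$$

$$\|f''\|_2 \le 2^{2+3J/2} \Big( \sum_{l=-r+1}^{2^J-1} a_l^2 \Big)^{1/2} \quad \text{for } r \ge 3. \tag{34}$$

Furthermore, for every $0 < s' \le 1$ there exists a finite constant $C_0(s')$ such that for every $r \ge 2$ and $f$ as above

$$\|f\|_{s',2} \le C_0(s')\, 2^{J(s'-1/2)} \Big( \sum_{l=-r+1}^{2^J-1} a_l^2 \Big)^{1/2}. \tag{35}$$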

Proof. The first claim is well-known, see, e.g., DeVore and Lorentz (1993), Theorem 5.4.2. To prove (33) use (29) and the fact that $N^{(r-1)}$ vanishes outside of $(0, r-1)$ for $r \ge 3$ and outside of $[0, 1)$ for $r = 2$, to obtain (interpreting the equality modulo $\lambda$-nullsets in case $r = 2$)

$$\begin{aligned}
f'(x) &= 2^J \sum_{l=-r+1}^{2^J-1} a_l N^{(r)\prime}(2^J x - l) = 2^J \sum_{l=-r+1}^{2^J-1} a_l \big[ N^{(r-1)}(2^J x - l) - N^{(r-1)}(2^J x - l - 1) \big] \\
&= 2^J \sum_{l=-(r-1)+1}^{2^J-1} a_l N^{(r-1)}(2^J x - l) - 2^J \sum_{l=-(r-1)+1}^{2^J-1} a_{l-1} N^{(r-1)}(2^J x - l) =: f_1 + f_2.
\end{aligned}$$

Using (32) for $f_1$ and $f_2$, we obtain

$$\|f'\|_2 \le \|f_1\|_2 + \|f_2\|_2 \le 2^{1+J/2} \Big( \sum_{l=-r+1}^{2^J-1} a_l^2 \Big)^{1/2}.$$

The third claim is proved similarly. To prove the final claim, we use the following interpolation inequality: for every $0 < s' \le 1$ there exists a finite constant $C^*(s')$ such that for every $h \in \mathcal{W}_2$

$$\|h\|_{s',2} \le C^*(s') \big( \|h\|_2 + \|h'\|_2 \big)^{s'} \|h\|_2^{1-s'} \tag{36}$$

holds. [This follows from ([5]) if $s' = 1$; if $s' < 1$ it follows from Theorem 6.7.1 in DeVore and Lorentz (1993) applied to the intermediate spaces $(\mathbb{R}, \mathbb{R})_{s',\infty}$, $(\mathcal{L}^2, \mathcal{W}_2)_{s',\infty}$, and to the operator that maps any real number $a$ into $ah$, observing that $(\mathcal{L}^2, \mathcal{W}_2)_{s',\infty}$ is equal to $\mathcal{B}_{s'}$ up to an equivalence of norms, cf. p. 196 in DeVore and Lorentz (1993).] Observe that $f \in \mathcal{W}_2$ if $r \ge 2$. Now, using (36) with $h = f$, (32), and (33) completes the proof upon setting $C_0(s') = (2.5)^{s'} C^*(s')$. ■
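The bounds (32) and (33) of Lemma 2 can be illustrated numerically for one randomly chosen coefficient vector. This is a sketch only, not part of the proof: the recurrence used to evaluate the B-splines, the midpoint Riemann sums, and the helper names `bspline`, `spline_and_derivative` and `l2_norm` are assumptions introduced for illustration.

```python
import numpy as np

def bspline(r, x):
    """Cardinal B-spline N^(r) via the standard recurrence (an assumption,
    the text defines N^(r) as an r-fold convolution of indicators)."""
    x = np.asarray(x, dtype=float)
    if r == 1:
        return ((x >= 0) & (x < 1)).astype(float)
    return (x * bspline(r - 1, x) + (r - x) * bspline(r - 1, x - 1)) / (r - 1)

def spline_and_derivative(a, r, J, x):
    """Evaluate f = sum_l a_l N^(r)(2^J x - l), l = -r+1, ..., 2^J - 1, and f'.

    The derivative uses N^(r)' = N^(r-1) - N^(r-1)(. - 1) and the chain rule.
    """
    ls = np.arange(-r + 1, 2 ** J)
    y = (2.0 ** J) * x[:, None] - ls[None, :]
    f = bspline(r, y) @ a
    df = (2.0 ** J) * ((bspline(r - 1, y) - bspline(r - 1, y - 1)) @ a)
    return f, df

def l2_norm(values, dx):
    """Riemann-sum approximation of the L2-norm on [0, 1]."""
    return np.sqrt(np.sum(values ** 2) * dx)

rng = np.random.default_rng(0)
r, J = 3, 4
a = rng.standard_normal(2 ** J + r - 1)      # one coefficient per basis function
n_grid = 20000
x = (np.arange(n_grid) + 0.5) / n_grid       # midpoint grid on [0, 1]
dx = 1.0 / n_grid

f, df = spline_and_derivative(a, r, J, x)
coef_norm = np.sqrt(np.sum(a ** 2))
bound_f = 2.0 ** (-J / 2) * coef_norm        # right-hand side of (32)
bound_df = 2.0 ** (1 + J / 2) * coef_norm    # right-hand side of (33)
print(l2_norm(f, dx), bound_f)
print(l2_norm(df, dx), bound_df)
```

In each printed pair the first number (the approximate norm) should not exceed the second (the bound).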

Lemma 3 Assume r > 1 and let 9 £ Q. 

a. Suppose the density p{9) is hounded. Then for all k > 1 and J > 1 

E\\pkjA&)~EpkjAS)\\l < Ci{e,r) — , 

where Ci{6,r) = {^^^)df. \\p{d)\\rx, with d^ defined in Proposition^ Furthermore, for r > 2 and 
< s' < 1 

E\\pk,jAS)-Epk,.j,m\l 2 < Co{s'fCi{e,r)'^ 
holds for all $k \ge 1$ and $J \ge 1$, where $C_0(s')$ is given in Lemma [2].

b. If $p(\theta) \in \mathcal{L}^2$, then for every $k$

$$\lim_{J \to \infty} \|E p_{k,J,r}(\theta) - p(\theta)\|_2 = 0.$$

If $p(\theta) \in \mathcal{B}_t$ for some $0 < t < r$ then for all $k \ge 1$ and $J \ge 1$

$$\|E p_{k,J,r}(\theta) - p(\theta)\|_2 \le 2^{-Jt} c'_t \|p(\theta)\|_{t,2},$$

where $c'_t$ is the constant given in Proposition [8] in Appendix [A].

c. If the assumptions of Part a (Part b) hold for (a version of) $p_0$ and $r_*$ in place of $p(\theta)$ and $r$, respectively, then the results in Part a (Part b) also apply mutatis mutandis to $p_{n,j,r_*}$.

Proof. In view of Lemma [21 the definition of Pkjr{9), ([32]) and ([35]) . it suffices to bound 
E 



(^^\j\o) — E'j'fj{9)j in order to prove Part a. We obtain 



E 

22J 



,2,7 



< 



k 

22.J 



< ^\\pm\\ 




^ ?n— — r+l 
2'-l 



<TlbWlloo E (.9^")' 



m— — r+1 



<^di\\pm\ 



(37) 



where we have used independence, p2p . and Proposition . This establishes Part a. [Measurability of the $\mathcal{L}^2$-norm is obvious, and measurability of the Besov-norm follows from Appendix [B].] Since $E p_{k,J,r}(\theta) = \pi_J^{(r)}(p(\theta))$, Part b follows from Proposition [8] in Appendix [A]. Part c is proved completely analogously. ■

Lemma 4 Assume r > 3 and let 9 be an interior point of O such that the partial derivative 
^''^q'^^ at 9 exists for every i; G V. 

' dp{v,0) 



a. Suppose the density p{9) is bounded and sup^gy 
J > 1 



< oo. Then for all k > 1 and 



E 



dpk,J,r{9) _ ^dpk^J,r{9) 



d9g 89, 



dp{v,e) 



03J 

<C2i9,r) — , 



where C2{9,r) = 2(r + l)^^ \\p{9)\\ ^siip^^y 

b. Suppose there exists an open ball B{9) C Q with center such that ^gg'^-* and ^g^g ' exist 
on B{9) for every x G [0, 1] and w G V, suppose ^gg^' -* belongs to Bs for some < s < r, and 

r-l 



that 



/ sup 

Jo 9'£B{e) 



dp{9', x) 



d9a 



dx < oo. 



sup 

V e'<EB{e) 



dp{v,9') 



09, 



dp(v) < oo. 
Then for all k > 1 and J > 1 



dp{0) 




dp{0) 






2 




s,2 



where the constant c'^ is defined in Proposition\^in Appendix\2i [If ^^sg' ^ € is weakened to 



MM^C^ then lim 



J— J-OO 



T^ dpk.j.Ae) _ dpje) 



holds.] 



Proof. Observe that Pk,j,r is differentiable at 9 because r > 3 is assumed. To prove Part a note 
that 



dpk,j.r{0) _ ^dpkj^riO) 



de„ 



de„ 



and that the £^-norm of this expression is measurable by Fubini's Theorem; also note that the 
expectations in the above display exist since the B-spline basis functions are bounded and since 



sup^gv 



dp(v,B) 



< oo has been assumed. Now, using the chain rule and ([33)) . we obtain 



E 



de„ 



r)2J 



2-' -1 




^ rn— — r+1 
2 



9 J N^j{x)\x=p{v.,e) 



(38) 



E (r)lm j.j-{r)f 
9j ^^mj 



m— — r+1 



23J+2 

< — - — dl sup 

k „gv 



dp{v,0) 



An application of Lemma [2] then completes the proof of Part a. 
To prove Part b, note that 



80. 



N^^l{x)pie,x)dx^ I N^^>{x)-^p{0,x)dx, 



q JO 



d 



where the two-fold interchange of integration and differentiation is permitted by dominated 
convergence in view of the maintained dominance assumptions on the derivatives of p and p as 
well as the boundedness of the B-spline basis functions and their first derivative. Consequently, 



E 



dpk,.j,r{d,y) 



2' E' E' 9^'"' fN^:li^)^p{O,x)dxNtj\y)^n[p(^pi0^^ 



(39) 



and Part b now follows immediately from Proposition [8] in Appendix [A].
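Lemma 5 a. Suppose Assumption R(i) is satisfied, $r \ge 2$, $\Theta$ is a bounded subset of $\mathbb{R}^b$, and $\sup_{\theta \in \Theta} \|p(\theta)\|_\infty < \infty$. Then there exist finite positive constants $C_3$ and $C_4$, depending only on $\Theta$, $b$, $\rho$, $r$, and $\sup_{\theta \in \Theta} \|p(\theta)\|_\infty$ but not on $k$ and $J$, such that

$$E \sup_{\theta \in \Theta} \|p_{k,J,r}(\theta) - E p_{k,J,r}(\theta)\|_2^2 \le C_3 \frac{2^{J} J}{k}$$

holds for all $k \ge 1$ and $J \ge 1$ satisfying $2^J J \le C_4 k$. Furthermore, for $0 < s' \le 1$

$$E \sup_{\theta \in \Theta} \|p_{k,J,r}(\theta) - E p_{k,J,r}(\theta)\|_{s',2}^2 \le C_0(s')^2 C_3 \frac{2^{J(2s'+1)} J}{k} \tag{40}$$

holds for all $k \ge 1$ and $J \ge 1$ satisfying $2^J J \le C_4 k$, where $C_0(s')$ is given in Lemma [2].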

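b. Suppose Assumption R(ii) is satisfied for some interior point $\theta_0$ of $\Theta$, $\sup_{\theta \in B(\theta_0)} \|p(\theta)\|_\infty < \infty$ and $r \ge 3$ hold. Then there exist finite positive constants $C_5$ and $C_6$, depending only on $B(\theta_0)$, $b$, $\rho$, $r$ and $\sup_{\theta \in B(\theta_0)} \|p(\theta)\|_\infty$ but not on $k$ and $J$, such that for every $q = 1, \ldots, b$

$$E \sup_{\theta \in B(\theta_0)} \Big\| \frac{\partial}{\partial \theta_q} p_{k,J,r}(\theta) - E \frac{\partial}{\partial \theta_q} p_{k,J,r}(\theta) \Big\|_2^2 \le C_5 \frac{2^{3J} J}{k}$$

holds for all $k \ge 1$ and $J \ge 1$ satisfying $2^J J \le C_6 k$.

c. Suppose the assumptions of Part b are satisfied except that now $r \ge 4$. Then there exist finite positive constants $C_7$ and $C_8$, depending only on $B(\theta_0)$, $b$, $\rho$, $r$ and $\sup_{\theta \in B(\theta_0)} \|p(\theta)\|_\infty$ but not on $k$ and $J$, such that for every $q, q' = 1, \ldots, b$

$$E \sup_{\theta \in B(\theta_0)} \Big\| \frac{\partial^2}{\partial \theta_q \partial \theta_{q'}} p_{k,J,r}(\theta) - E \frac{\partial^2}{\partial \theta_q \partial \theta_{q'}} p_{k,J,r}(\theta) \Big\|_2^2 \le C_7 \frac{2^{5J} J}{k}$$

holds for all $k \ge 1$ and $J \ge 1$ satisfying $2^J J \le C_8 k$.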
Proof, a. By Lemma [2] we have 

\Pk,jA^)-EpkMQ)\\l<'^~-^ 



Z —1 

E sup \\pt, J A&)-EPk J AO)\\l< 2-' E sup (^\'j> {9) -E^['j\9) 

see „ , 1 fee ^ 



l=-r+l 



Note that the suprema in the above display are measurable as the functions over which the suprema are taken depend continuously on $\theta$ in view of Assumption R(i) and $r \ge 2$. We bound the r.h.s. in the above display by applying the moment inequality given in Proposition [12] in Appendix [C]: fix an arbitrary $l$ and express the corresponding summand in the above display as



where 



£;sup U^\9)-E^[^J{9) 



22,7 

— £;sup 
k'' eee 



i=l 



(41) 



2''-l 

he,iiv) = J2 9j 

m— — r+1 



(r)l'n 



N^rn!,ip{v,0))-EN^]ipiV.,9)) 



and set TLi^j^r = {hej : 9 £ 8}. Furthermore, set U = max (2,sup0g0 l|p(^)lli^^) a-nd cr^ 



2 •^U^. Then < a < U holds, and using the calculations that have led to ([57)1 we obtain for 
every 9 e Q 



EhlAV,) < E 



2'-l 

E 

. m— — r+l 



<2-'di\\pm^<'^ 



-.1,2 



< sup||p(0)||^<a^ 
eee 



Furthermore, using ([28|. we obtain for every 9 



sup \hg^i\ < 2dr 



< 2dr < U. 



We next bound the uniform $L^\infty$-covering numbers of $\mathcal{H}_{l,J,r}$: observe that the elements of $\mathcal{H}_{l,J,r}$ satisfy for $\theta, \theta' \in \Theta$

$$\sup_{v \in V} |h_{\theta,l}(v) - h_{\theta',l}(v)| \le 2^{J+1} d_r L \|\theta - \theta'\|^{\alpha}, \tag{42}$$

where $L$, $\alpha$ are the Hölder constants from Assumption R(i) and where we have made use of the fact that $N^{(r)}$ has Lipschitz constant bounded by 1 for $r \ge 2$; cf. the discussion at the beginning of this section. Since $\Theta$ is assumed to be bounded in $\mathbb{R}^b$, it can be covered by fewer than $M/\delta^b$ open balls with centers $\theta_i \in \Theta$ and radius $\delta$, for $0 < \delta \le 1$, where $M$ depends only on $\Theta$. By (42), the functions $h_{\theta_i,l}$ in $\mathcal{H}_{l,J,r}$ corresponding to the $\theta_i$'s give rise to a covering of $\mathcal{H}_{l,J,r}$ by sup-norm balls of radius $2^{J+1} d_r L \delta^{\alpha}$. Consequently, the $L^\infty$-covering numbers satisfy

$$N(\mathcal{H}_{l,J,r}, L^\infty(V), \varepsilon) \le M \Big( \frac{2^{J+1} d_r L}{\varepsilon} \Big)^{b/\alpha} \quad \text{for } 0 < \varepsilon \le 2^{J+1} d_r L. \tag{43}$$



Replacing M by M* = M max (l, {U / {2drL)f'°'^ in (gS]), guarantees that (gg then holds for 
< £ < 2f7, which leads to 



N{ni,j,r, L°°(V),e) < {AU/ey for 0<e<2U, 



(44) 



for V — max(&/Q!, 2) and A — max (^2''^^ M^^'' drLU ^,2ej, where we have also enforced v >2 
and A > e. Note that, apart from the factor 2"^ , A depends only on 9, 5, p (via a and L), r (via 
dr), and supg^Q ll?'(^)lloo- Observe that Hi^j^r contains a countable sup-norm dense subset in 
view of (|42|) and separability of Q. Hence the expectation bound in Part a of Proposition [T2] in 
Appendix [C] applied to this subset and with 6o = now yields the existence of positive finite 
constants Cg and C4 both depending only on 0, 5, p, r, and supg^g lb(^)llco' ^^'A^ that for all 
J e N and all k > C^2"' J 

k 2 



E sup 

eee 



< C'zk2-^J. 



(45) 



Since this bound does not depend on the summation index $l$, the proof of the first claim is complete upon setting $C_3 = (r+1)C_3'/2$ and $C_4 = 1/C_4'$. The second claim follows immediately from applying (35) in Lemma [2] to the l.h.s. of (40) and using (41) and (45), the measurability of the supremum in (40) following from Appendix [B].

b. Observe that $p_{k,J,r}$ is continuously differentiable on $B(\theta_0)$ because of $r \ge 3$ and Assumption R(ii). Similarly as in Part a we have measurability of the suprema and obtain from Lemma [2]



E sup 

eeBiOo) 



d 



d 



d0, 



< 



V —E sup 

i=-r+i ^ »ei3(eo) 



where 



2 -1 



^ m— — r+1 

Set Uf;]^^^ |4!i : ^ e S(6lo)| and define 



i7 = 2dr sup sup 



max 1, sup 

V 9es(eo) 



1/2 



and (7^ = 2 "^J/^. Then < cr < C/ holds (where we exclude the trivial case U Observing 



that ^^7(0;) = 2''Af('-)'(2'^ X — m) by the chain rule, we obtain, using the same calculations that 
have led to JSll), for 6* £ B{eo) 



Eh^gfiy,) < 2-''+^dl sup sup 



eeB(ea) vev 



dpiv,9) 



sup ||p(0)|L<a2. 



Furthermore, for every 9 e B{9q) 



sup 



.(1) 



< 2 sup 

vev 



dp{v,9) 



d9„ 



< 2dr sup sup 



dp{v,9) 



d9„ 



where we have made use of (j30p . To bound the uniform L°°-covering numbers of TLi^j^, observe 



that the elements of 'H^^jj. satisfy for 9, 9' e B{9q) 



sup 

2dr 



h'-sl{v)-hl)'{v) 



(1) 



< 



sup fiMY> p{v, 9)\\\\9 -9' \\+ 2-^+^ dr sup sup || Vep(i;, 6')||^ ||6i - 6l'| 



7V('') 

< 2'^+V^<! sup sup||V0p(i;,6')|| + sup s\xp\\\/ ep{v,9)f \ \\9 ~ 9'\\ <2-' cA\9 - 9' 



where we have made use of ((30)l . of the bound on the Lipschitz constant of N''^'^' given at the 
beginning of this section, and of the boundedness of B(9o); the constant c* is finite and depends 
only on p, r, and B(9q). Proceeding as in the proof of Part a we obtain 

iVCHpj,,, L°°(V),£) < {AU/eY for < e < 2C/, 
for V = niax(6, 2) and A = max (2"'Mi/fcniax(c*f/^\l),2e) with M only depending on 5(6*0) 
Note that, apart from the factor 2'^, A depends only on B{9o), b, p, r and sup^g^^g^^ l|p(^')lloo 
Part a of Proposition [T^ in Appendix ICl applied to a comitable sup-norm dense subset of H^^-* 



l,J,r 



and with bo = v ^ now yields the existence of positive finite constants C5 and Cg depending 
only on B{9o), b, p, r and svi\igfzB{eo) lb(^)lloo' ^^'^h ^^^^ all J € N and aU k > Cg2'^J 



E sup 



i=l 



holds. Since this bound does not depend on I, the proof is complete upon setting C5 = (r+l)Cg/2 
and Ce = 

c. The proof is similar to the proof of Part b: Observe that Pk,j.r is twice continuously 
differentiable on B{9o) because of r > 4 and Assumption R(ii). By Lemma [5] we have 



E sup 

eeBieo) 



Pk,J,r{9) ~ E 



d9,d9g. 



Pk.J,r{9) 



25,7 



2'-l 



eeB{eo) 



where 



d9qd9,. 

^ ^ m— — r+1 



- 9J N'^''^'{2-^p{v,9)-m)-EN'^'-^'{2-'p{V,,9)~m) 



dp{v,9) dp{v,9) 
89, 89,, 



gj ^'"" [7V('')"(2^p(«, f?) - m) - S7VM"(2^p(l/„ 0) - m) 



m— — r+1 



Set n^^lr^ [hfj : 9 € B{9o)], set 



U = rfrmax< sup sup || V0p(w, 0) || + 4 sup sup || Vep(w, 6')Vep(f , 6')'|| , 
[eeB{eo)vev eeB{eo)vev 



sup ||p(6')| 
eeBiQo) 



1/2 

00 



z sup sup \\wIp{v,9)\\ +32 sup snp\\\/ 0p{v,9)\/0p{v,9y\\ 
eeB{eo)vev eeB(eo)vev 



1/2^ 



and 0-2 = 2--^U'^. Then < a < [/ holds (where we exclude the trivial case U = 0), and for 
9 e B{9o) we have 



Ehf}\Vi) < 2^-'''dl sup ||p(0)|L sup sup 



8^p{v,9) 



d9q89q> 



+2^ sup ||p(6l)||^ sup sup 



eeBieo) 



9eB(9o) v£V 



dp{v,9) dp{v,9) 



d9g 39,. 



using a calculation similar to the one that has led to psp and making use of Lemma [2l Similarly, 
for 9 e B{9q) we obtain 



sup 



< 2dr <{ 2^-^ sup 
vev 



d^piv,9) 



89,89,. 



+ sup 

oo,R ^gv 



8p{v,9) 8p{v,9) 



89, 89,. 
using ||Af^''^'||^R < 1 and ||iV('^)"||^ < 2, cf. (ED). Furthermore, for 9, 0' e 5(6*0) we get 

again using ([HO)) . (PT|) . the bounds for the Lipschitz constants of N^^^' and N'^^'>" given at the 
beginning of this section, and boundedness of B{9o) 



sup 



hf}{v)-hfAv) 



< 2^-^drL'\\9 



-12dr sup sup II Ve/9(u, ( 

eeB{eo) vev 

■)J+3 



sup sup ||Vg/9(u, 0)|| ||0 

eeB{ea) vev 



+2''+'^dr sup sup ||Vep(w,6')|| sup sup || Ve/9(u, 6')|| 



^ 2 if; 



with the constant c,* being finite and depending only on B{9o), r, p. Proceeding as in the proof 
of Part a we obtain 

L°°(V), e) < (AU/ey for < e < 2U, 

where now v — max(6//3, 2) and A = max (2'^Af'^/''max(c,,[/"\l),2e) with M only depending 
on B{9q). Again, apart from the factor 2'^ , A depends only on B{9q), b, p, r, and supg^^^g^^j ||p(^)||o2- 

Part a of Proposition [T^] in Appendix [Cl applied to a countable sup- norm dense subset of "Hp],, 
and with bo — now yields the existence of positive finite constants Cy and Cg depending 
only on B{9q), 6, p, r, and sup^g^jg^^) lb(^)lloo' ^^^^ that for all J e N and all k > Cg2'^J 



E sup 

eee 



< Cljk2--'J 



holds. Since this bound does not depend on the proof is complete upon setting C7 — (r+l)C7/2 
and Cs = ■ 

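Corollary 1 Suppose Assumption R(i) is satisfied and $r \ge 2$. Suppose further that $\Theta$ is a bounded subset of $\mathbb{R}^b$ and that $\{p(\theta) : \theta \in \Theta\}$ is bounded in $\mathcal{B}_t$ for some $1/2 < t < 1$. If $J_k \in \mathbb{N}$ satisfies

$$\sup_{k \ge 1} 2^{J_k(2t+1)} J_k / k < \infty, \tag{46}$$

then $\sup_{\theta \in \Theta} \|p_{k,J_k,r}(\theta)\|_{t,2}$ is stochastically bounded, i.e.,

$$\lim_{M \to \infty} \sup_{k \ge 1} \Pr\Big( \sup_{\theta \in \Theta} \|p_{k,J_k,r}(\theta)\|_{t,2} > M \Big) = 0.$$

If (46) holds and $J_k \to \infty$ for $k \to \infty$, then, for every $0 < t' < t$, $\sup_{\theta \in \Theta} \|p_{k,J_k,r}(\theta) - p(\theta)\|_{t',2}$ as well as $\sup_{\theta \in \Theta} \|p_{k,J_k,r}(\theta) - p(\theta)\|_\infty$ converge to zero in (outer) probability as $k \to \infty$.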

Proof. Observe that under (46) we have $2^{J_k} J_k \le C_4 k$ for $k$ large enough, where $C_4$ is as in Lemma [5], and that $\{p(\theta) : \theta \in \Theta\}$ is sup-norm bounded. Now, using Lemma [5] together with Ljapunov's inequality as well as Proposition [9] in Appendix [A] we arrive, for $k$ large enough, at

$$\begin{aligned}
E \sup_{\theta \in \Theta} \|p_{k,J_k,r}(\theta)\|_{t,2} &\le E \sup_{\theta \in \Theta} \|p_{k,J_k,r}(\theta) - E p_{k,J_k,r}(\theta)\|_{t,2} + \sup_{\theta \in \Theta} \|E p_{k,J_k,r}(\theta)\|_{t,2} \\
&\le C_0(t) \sqrt{C_3}\, 2^{J_k(t+1/2)} \sqrt{J_k/k} + \sup_{\theta \in \Theta} \|\pi_{J_k}^{(r)}(p(\theta))\|_{t,2} \\
&\le C_0(t) \sqrt{C_3}\, \sup_{k \ge 1} 2^{J_k(t+1/2)} \sqrt{J_k/k} + c''_t \sup_{\theta \in \Theta} \|p(\theta)\|_{t,2} < \infty,
\end{aligned}$$

where we have used the already established fact that $E p_{k,J_k,r}(\theta) = \pi_{J_k}^{(r)}(p(\theta))$. [Measurability of $\sup_{\theta \in \Theta} \|p_{k,J_k,r}(\theta)\|_{t,2}$ follows from Appendix [B].] Together with the observation that $E \sup_{\theta \in \Theta} \|p_{k,J_k,r}(\theta)\|_{t,2} < \infty$ for every $k \ge 1$, this completes the proof of the first claim. Next, Lemma [5] (applied with $s' = t'$) gives for $k$ large enough ($E^*$ denoting outer expectation)

$$\begin{aligned}
E^* \sup_{\theta \in \Theta} \|p_{k,J_k,r}(\theta) - p(\theta)\|_{t',2} &\le E \sup_{\theta \in \Theta} \|p_{k,J_k,r}(\theta) - E p_{k,J_k,r}(\theta)\|_{t',2} + \sup_{\theta \in \Theta} \|\pi_{J_k}^{(r)}(p(\theta)) - p(\theta)\|_{t',2} \\
&\le C_0(t') \sqrt{C_3}\, 2^{J_k(t'+1/2)} \sqrt{J_k/k} + 2^{-J_k(t-t')} c'''_{t,t'} \sup_{\theta \in \Theta} \|p(\theta)\|_{t,2},
\end{aligned}$$

where we have used Proposition [9] in Appendix [A] in the final step. The upper bound now converges to zero as $k \to \infty$. The claim regarding the sup-norm now follows from Proposition [7] in Appendix [A]. ■

The following corollary is proved analogously using Lemma [3] instead of Lemma [5], with measurability of the relevant quantities following from Appendix [B].

Corollary 2 Suppose $r_* \ge 2$ and that $p_0 \in \mathcal{B}_t$ for some $1/2 < t < 1$. If $j_n \in \mathbb{N}$ satisfies

$$\sup_{n \ge 1} 2^{j_n(2t+1)} / n < \infty, \tag{47}$$

then $\|p_{n,j_n,r_*}\|_{t,2}$ is stochastically bounded, i.e.,

$$\lim_{M \to \infty} \sup_{n \ge 1} \Pr\big( \|p_{n,j_n,r_*}\|_{t,2} > M \big) = 0.$$

If (47) holds and $j_n \to \infty$ for $n \to \infty$, then, for every $0 < t' < t$, $\|p_{n,j_n,r_*} - p_0\|_{t',2}$ as well as $\|p_{n,j_n,r_*} - \bar{p}_0\|_\infty$ converge to zero in probability as $n \to \infty$, where $\bar{p}_0$ is the continuous version of $p_0$.

7 Uniform Central Limit Theorems for Spline Projection 
Estimators 

We now study the difference between the random (signed) measure $P_{k,J,r}(\theta)$ given by

$$dP_{k,J,r}(\theta)(y) = p_{k,J,r}(\theta, y)\,dy$$

and $P_k(\theta)$, acting on Besov classes by integration. In the following $\|\nu\|_{\mathcal{F}}$ stands for $\sup_{f \in \mathcal{F}} |\nu(f)|$, where $\nu$ is a (signed) measure.

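Theorem 3 Suppose Assumption R(i) is satisfied, $r \ge 2$, $\Theta$ is a bounded subset of $\mathbb{R}^b$, and $\{p(\theta) : \theta \in \Theta\}$ is a bounded subset of $\mathcal{B}_t$ for some $t$, $0 < t < r$. Let $\mathcal{F}$ be a (non-empty) bounded subset of $\mathcal{B}_s$ for some $s$, $1/2 < s < 1$. Then for every $1/2 < s' < s$ there is a finite positive constant $C_9$, depending only on $s$, $s'$, $t$, $\mathcal{F}$, $\Theta$, $b$, $\alpha$, $L$, and $\{p(\theta) : \theta \in \Theta\}$ but not on $J$ and $k$, such that for every $J \ge 1$ and $k \ge 1$

$$E \sup_{\theta \in \Theta} \|P_{k,J,r}(\theta) - P_k(\theta)\|_{\mathcal{F}} \le C_9 \big( 2^{-J(s+t)} + 2^{-J(s-s')} k^{-1/2} \big). \tag{48}$$

Furthermore,

$$\sup_{\theta \in \Theta} \|P_k(\theta) - P(\theta)\|_{\mathcal{F}} = O_P(k^{-1/2}) \tag{49}$$

holds. Finally, if $J_k \to \infty$ as $k \to \infty$ satisfies $2^{-J_k(s+t)} = o(k^{-1/2})$, then for every $\theta \in \Theta$

$$\sqrt{k}\,\big( P_{k,J_k,r}(\theta) - P(\theta) \big) \rightsquigarrow_{\ell^\infty(\mathcal{F})} G_{P(\theta)},$$

where $G_{P(\theta)}$ is a sample-bounded and sample-continuous generalized $P(\theta)$-Brownian bridge indexed by $\mathcal{F}$. Here $\rightsquigarrow_{\ell^\infty(\mathcal{F})}$ denotes convergence in law as defined in Chapter 1 of van der Vaart and Wellner (1996).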

Proof. We first note that supggQ \\Pk,j,r{0) — PkiP)\\j: and sup^gg, \\Pk[Q) — P{())\\t are mea- 
surable since they can be represented as suprema over countable dense subsets of Q and T in 
view of Assumption R(i), r > 2, and separability of F. For / G we can write, using (O, ([8]), 

,(r) 



12)) and symmetry of the projection kernel Kj , 

k 



(PkjAS) - Pk{o))if) 



f{y)K^j\xm,y)dy- f{X0)) 



k 1 

T EC'^J^l/) - f)iMO)) = (Pkie) Pi9mn^j\f) - /) + / (Tr^jHf) f){y)p{9){y)dy 



A + B. 



Consider first term B: Using / €! C^, p{0) € C^, self-adjointness and idempotency of the projec- 



(r) 

tion Id — IT J we obtain 



^''j\f)~l) {y)p{0){y)dy 



Id 



Ap)f) iy)[[ld-n[p)p{e)) {y)dy 



< 



f-npif) ^ p{9)~7r);>{p{e)) 



< c'A\\.f\\s.2\\pmk2^-"^'^'\ 



(50) 



where we have used Proposition [5] for the last inequality. Consider next the term A: Define for 
J > 1 the class of functions 



■Pj,r,, 



which allows us to write 



KP{p{;e),y)f{y)dy - 0)) : / G ^, G 6 

{{7r[pif)^f){p{;9)):feT,0ee}, 



E sup sup 



iP,i9)~Pi9))i7r[p{f)-f) 



-E sup 

1^ hdTj.r 



J2 im) - Eh{v,)) 



(51) 



(52) 



Choose an arbitrary s' satisfying 1/2 < s' < s and observe that {T^'"j\f) — /) G Bg C B^,' since 
C by assumption and that Sj{r) C B^ C B^,' in view of s < 1 < r — 1/2. Propositions [7] 
and |9] in Appendix |2] then give 

sup sup\h{v) - Eh{Vi)\ < 2 sup sup |/i(i;)| < 2 sup tt'^PU) - f 



< 2cs' sup 



n^if) - / < 2c,,c- , sup 2-^(-^') U 

s',2 ■ fer 
where $U < \infty$ since $\mathcal{F}$ is a (non-empty) bounded subset of $\mathcal{B}_s$. We may assume $U > 0$, the case $U = 0$ being trivial. Since $\mathcal{F}_{J,r,\rho}$ contains a countable sup-norm dense subset in view of Proposition [6] below, we may apply the moment inequality from Proposition [12], part b, in Appendix [C]
to ([32]) (with U as above, cr = {/, A' = c*^'/ ^2cs'c"'j, sup^gjr ||/||g 2)' with w = 1/s') and 

make use of the entropy bound in Proposition [5] below with e* — Acs'c'^'g, sup^gj^- 1|/||^ 2 — 
This gives the bound 



E sup sup 



{pm pmi^Tif) - /) < 2-'(^-')+ifc-v2c,,c- , sup 11/11, 2 &2 



where the constant $b_2$ only depends on $A'$ and $w$. Together with (50), this proves the bound (48). To prove the second claim, define the class



and note that J-p is uniformly bounded since T is and that 



(53) 



suv\\Pk{0) - Pm\T ^ T sup 



^ (M^.) - £;/i(v,)) 



Now (49) follows since $\mathcal{F}_\rho$ is a universal Donsker class by Proposition [6] below. The third claim of the theorem follows immediately from (48) with $s'$ chosen to satisfy $s' < s$, from the assumptions on $J_k$, and from the universal Donsker property of $\{f(\rho(\cdot,\theta)) : f \in \mathcal{F}\}$ for every $\theta$, which it inherits from $\mathcal{F}_\rho$. ■

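Proposition 6 Suppose Assumption R(i) is satisfied, $r \ge 2$, and $\Theta$ is a bounded subset of $\mathbb{R}^b$. Let $\mathcal{F}$ be a (non-empty) bounded subset of $\mathcal{B}_s$, $1/2 < s < 1$. Let $\mathcal{F}_{J,r,\rho}$ and $\mathcal{F}_\rho$ be defined as in (51) and (53). Then for every $1/2 < s' < s$ and every $\varepsilon^* > 0$ there exists a (positive) finite constant $c^*$, depending only on $s$, $s'$, $\mathcal{F}$, $\Theta$, $b$, $\alpha$, $L$, and $\varepsilon^*$ but not on $J$, such that for every $J \ge 1$

$$\log N(\mathcal{F}_{J,r,\rho}, L^\infty(V), \varepsilon) \le 2^{-J(s-s')/s'} c^* \varepsilon^{-1/s'} \quad \text{for } 0 < \varepsilon < \varepsilon^* \tag{54}$$

holds. Furthermore, for every $\varepsilon^* > 0$ there exists a (positive) finite constant $c^{**}$ (depending only on $s$, $\mathcal{F}$, $\Theta$, $b$, $\alpha$, $L$, and $\varepsilon^*$) such that

$$\log N(\mathcal{F}_\rho, L^\infty(V), \varepsilon) \le c^{**} \varepsilon^{-1/s} \quad \text{for } 0 < \varepsilon < \varepsilon^* \tag{55}$$

holds. In particular, $\mathcal{F}_\rho$ and $\mathcal{F}_{J,r,\rho}$ are universal Donsker classes.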
Proof. Let $s'$ be as in the proposition. By Proposition [9]

$$\sup_{f \in \mathcal{F}} \|\pi_J^{(r)}(f) - f\|_{s',2} \le 2^{-J(s-s')} c'''_{s,s'} \sup_{f \in \mathcal{F}} \|f\|_{s,2} = 2^{-J(s-s')} D < \infty, \tag{56}$$

where the constant $D$ depends only on $s$, $s'$, and $\mathcal{F}$. As a consequence, $\mathcal{G}_J = \{\pi_J^{(r)}(f) - f : f \in \mathcal{F}\}$ is contained in a ball $\mathcal{U}_J$ in $\mathcal{B}_{s'}$ of radius $2^{-J(s-s')} D$. Using entropy bounds for balls in Besov spaces (e.g., Theorem 15.6.1 in Lorentz, v.Golitschek, and Makovoz (1996)) we obtain

$$\log N(\mathcal{G}_J, L^\infty([0,1]), \varepsilon) \le 2^{-J(s-s')/s'} c(s,s',\mathcal{F})\, \varepsilon^{-1/s'} \quad \text{for } 0 < \varepsilon < \infty,$$

where the finite and positive constant $c(s,s',\mathcal{F})$ depends only on $s$, $s'$, and $\mathcal{F}$ (in particular, it is independent of $J$). [Setting $p = 2$, $q = \infty$ in Lorentz, v.Golitschek, and Makovoz (1996) we actually obtain the above bound only in the ess-sup norm. However, since $\mathcal{G}_J$ consists of continuous functions only and since we can always assume that the centers of the covering ess-sup norm balls belong to $\mathcal{G}_J$ (perhaps at the expense of doubling $\varepsilon$), we immediately obtain the same bound for the supremum-norm.]

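To prove the entropy bound for $\mathcal{F}_{J,r,\rho} = \{g(\rho(\cdot,\theta)) : g \in \mathcal{G}_J, \theta \in \Theta\}$ we proceed as follows: Note that the elements of $\mathcal{G}_J$ are Hölder continuous of order $s' - 1/2$ with Hölder constants uniformly bounded by $2^{-J(s-s')} c_1(s', D)$, with $0 < c_1(s', D) < \infty$ depending only on $s'$ and $D$, since $\mathcal{G}_J \subseteq \mathcal{U}_J \subseteq \mathcal{B}_{s'}$ and since for $1/2 < s' < 1$ the space $\mathcal{B}_{s'}$ is continuously embedded into $C^{s'-1/2}$, cf. Proposition [7] in Appendix [A]. Define $\eta = (\alpha(s' - 1/2))^{-1}$ with $\alpha$ defined in Assumption R(i). For $0 < \varepsilon < 1$ set $\delta = (2^{J(s-s')} \varepsilon)^{\eta}$ and cover $\Theta$ by $\delta$-balls with centers $\theta_1, \ldots, \theta_{N(\delta,\Theta)}$ where $N(\delta,\Theta)$ satisfies $N(\delta,\Theta) \le \max(1, M(\Theta)/\delta^b)$ for some constant $M(\Theta)$ only depending on $\Theta$. Let $g_1, \ldots, g_{N(\mathcal{G}_J, L^\infty([0,1]), \varepsilon)}$ be the centers of $L^\infty([0,1])$-balls of radius $\varepsilon$ covering $\mathcal{G}_J$. We then have for $g(\rho(\cdot,\theta)) \in \mathcal{F}_{J,r,\rho}$ using Assumption R(i)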

sup \g{p{v, 9)) - g^{p{v,9i)\ 
vev 

< sup \gipiv, 9)) - g{p{v, 9i))\ + sup \g{p{v, 9i)) - g,{p{v, 9i))\ 
vev vev 

< 2-'^'-''^c,is',D){L\9-9i\y'-'^' + sup \gix) - g,ix)\ < U{s' , D)L'/^ + l) e 

xe[o,i] ^ ' 

for suitable choice of i and I. Consequently, we obtain for < e < 1 

log7V(J-j,,,p, L°°(V), (ci(s', B)L^I^ + l) e) < log A^(g,7, L°°([0, 1]),£) + logiV(5, 6) 

< c(s,s',^) (2''(^-"')e)"'^' +log+ (Af(e)/(2^(^""')e)'"') < c,2--^^'-'"^l'' e-^l'\ 

for a suitable finite constant c, only depending on s, s', T ^ 0, 6, and a, but not on J. After a 
simple substitution, this gives ((54)) for < e < ci{s' ,D)L^^^ + 1. Appropriately adjusting the 
multiplicative constant in this so-obtained bound gives (54) for all $0 < \varepsilon < \varepsilon^*$; note that the adjustment of the constant only introduces an additional dependence on $\varepsilon^*$ (but no dependence on $J$). The entropy bound (55) for $\mathcal{F}_\rho$ is proved in a similar (even simpler) way. The Donsker property of $\mathcal{F}_{J,r,\rho}$ and $\mathcal{F}_\rho$ now follows from (54), (55) and Theorem 2.8.4 in van der Vaart and Wellner (1996), noting that $\mathcal{F}_{J,r,\rho}$ and $\mathcal{F}_\rho$ are uniformly bounded in view of Proposition [7] and that the bracketing covering numbers are dominated by the sup-norm covering numbers. ■

An analogous result holds for the random (signed) measure $P_{n,j,r_*}$ given by $dP_{n,j,r_*}(y) = p_{n,j,r_*}(y)\,dy$. The proof of this result is similar to, in fact simpler than, the proof of Theorem [3] and thus is omitted.

Theorem 4 Suppose $r_* \ge 2$, and $p_0 \in \mathcal{B}_t$ for some $t$, $0 < t < r_*$. Let $\mathcal{F}$ be a (non-empty) bounded subset of $\mathcal{B}_s$ for some $s$, $1/2 < s < 1$. Then for every $1/2 < s' < s$ there is a finite positive constant $C_{10}$ independent of $j$ (only depending on $s$, $s'$, $t$, $\mathcal{F}$, and $p_0$) such that for every $j \ge 1$ and $n \ge 1$

$$E \|P_{n,j,r_*} - P_n\|_{\mathcal{F}} \le C_{10} \big( 2^{-j(s+t)} + 2^{-j(s-s')} n^{-1/2} \big).$$

Furthermore, $\|P_n - P\|_{\mathcal{F}} = O_P(n^{-1/2})$ holds. Finally, if $j_n \to \infty$ as $n \to \infty$ satisfies $2^{-j_n(s+t)} = o(n^{-1/2})$, then

$$\sqrt{n}\,\big( P_{n,j_n,r_*} - P \big) \rightsquigarrow_{\ell^\infty(\mathcal{F})} G_P,$$

where $G_P$ is a sample-bounded and sample-continuous generalized $P$-Brownian bridge indexed by $\mathcal{F}$.

A Appendix: Some Properties of Besov Spaces and Approximation by Splines

In the following, we summarize some simple properties of the spaces $\mathcal{B}_s$. For $0 < s < 1$ and bounded $f : [0, 1] \to \mathbb{R}$ denote by

$$\|f\|_{s,\infty} = \|f\|_\infty + \sup_{x,y \in [0,1],\, x \ne y} \frac{|f(x) - f(y)|}{|x - y|^s}$$

the usual Hölder norm and denote by $C^s$ the set of all functions $f$ with finite $\|f\|_{s,\infty}$. For simplicity we restrict ourselves to the case $s < 1$ in the following proposition.

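Proposition 7 Let $1/2 < s < 1$.

a. Every $f \in \mathcal{B}_s$ is $\lambda$-a.e. equal to a function $\bar{f} \in C^{s-1/2}$ and

$$\|\bar{f}\|_\infty \le \|\bar{f}\|_{(s-1/2),\infty} \le c_s \|\bar{f}\|_{s,2} = c_s \|f\|_{s,2}$$

holds for some finite (positive) constant $c_s$ that depends only on $s$.

b. If $f \in \mathcal{B}_s$ and $h \in \mathcal{B}_s$, then $\|fh\|_{s,2} \le 2 c_s \|f\|_{s,2} \|h\|_{s,2}$. If $h \in \mathcal{B}_s$ satisfies $C := \inf_{x \in [0,1]} h(x) > 0$, then $\|1/h\|_{s,2} \le C^{-1} + C^{-2} \|h\|_{s,2}$.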

Proof. a. Observe that $\mathcal{B}_s$ coincides (up to norm equivalence) with the intermediate space $(\mathcal{L}^2, \mathcal{W}_2)_{s,\infty}$ (DeVore and Lorentz (1993), p. 196) and hence coincides with the Besov space $B^{s;2,\infty}((0, 1))$ defined in Adams and Fournier (2003) (the fact that the latter is defined on the open unit interval being irrelevant). The claim then follows from applying Theorem 7.37 in Adams and Fournier (2003) (with $m = n = 1$, $j = 0$, $p = 2$, $q = \infty$).

b. Since $s < 1$ by assumption, we may set $a = 1$ in the definition of the Besov (semi)norm. Elementary calculations then show that

$$\|fh\|_{s,2} \le \|f\|_{s,2} \operatorname{ess\,sup} |h| + \|h\|_{s,2} \operatorname{ess\,sup} |f| \le 2 c_s \|f\|_{s,2} \|h\|_{s,2}$$

in view of Part a. The second claim follows since clearly $\|1/h\|_2 \le C^{-1}$ and since elementary calculations give $\|\Delta_z h^{-1}\|_2 \le C^{-2} \|\Delta_z h\|_2$. ■

The above proposition, together with the continuous embedding of $\mathcal{B}_t$ into $\mathcal{B}_s$ for $t > s$ (DeVore and Lorentz (1993), p. 56), immediately guarantees for every $t > 1/2$ the existence of a constant $c_t$, $0 < c_t < \infty$, such that for every $f \in \mathcal{B}_t$ there exists a (unique) continuous $\bar{f}$, $\lambda$-a.e. equal to $f$, such that $\|\bar{f}\|_\infty \le c_t \|\bar{f}\|_{t,2} = c_t \|f\|_{t,2}$. In particular, bounded subsets of $\mathcal{B}_t$, $t > 1/2$, are sup-norm bounded.

As is well known, functions in $\mathcal{B}_s$ can be approximated by elements of the Schoenberg spaces $S_j(r)$, the error decreasing as $j$ increases. We summarize these facts in the following proposition.

Proposition 8 Suppose $r \in \mathbb{N}$.

a. If $h \in \mathcal{L}^2$, then the ortho-projection operator $\pi_j^{(r)}$ from $\mathcal{L}^2$ onto the Schoenberg space $S_j(r)$ satisfies

$$\lim_{j \to \infty} \|\pi_j^{(r)}(h) - h\|_2 = 0.$$

If $H$ is a relatively compact subset of $\mathcal{L}^2$, then

$$\lim_{j \to \infty} \sup_{h \in H} \|\pi_j^{(r)}(h) - h\|_2 = 0.$$

b. If $h \in \mathcal{B}_s$ for some $s \in (0, r)$, then

$$\|\pi_j^{(r)}(h) - h\|_2 \le 2^{-js} c'_s \|h\|_{s,2},$$

for every $j \in \mathbb{N}$, where the (positive) finite constant $c'_s$ depends only on $s$.

Proof. To prove the first claim in Part a, observe that by Proposition 2.4.1 and (12.3.2) in DeVore and Lorentz (1993)

$$\|\pi_j^{(r)}(h) - h\|_2 \le 2 C^{(r)} \sup_{0 < z \le 2^{-j}} \|\Delta_z^r(h)\|_2$$

for some universal constant $C^{(r)}$. By continuity of translation in $\mathcal{L}^2(\mathbb{R})$ (cf., e.g., Folland (1999), Proposition 8.5) the right-hand side converges to zero as $j \to \infty$ (note that $\|\Delta_z^r(h)\|_2$ is less than or equal to the corresponding expression that is obtained when $h$ is viewed as a function on $\mathbb{R}$ which is zero outside of $[0, 1]$). The second claim in Part a follows since for every $\varepsilon > 0$ and $\varepsilon$-net $\{h_l : 1 \le l \le N(\varepsilon)\}$ for $H$ we have that $\|h - h_l\|_2 < \varepsilon$ implies $\|\pi_j^{(r)}(h) - \pi_j^{(r)}(h_l)\|_2 < \varepsilon$ and thus

$$\sup_{h \in H} \|\pi_j^{(r)}(h) - h\|_2 \le \max_{1 \le l \le N(\varepsilon)} \|\pi_j^{(r)}(h_l) - h_l\|_2 + 2\varepsilon$$

holds. For the proof of Part b use Proposition 2.4.1 and (12.3.2) in DeVore and Lorentz (1993) (where one sets $p = 2$, $n = 2^j$) together with the definition of the Besov-norm. ■
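The decay of the projection error in Proposition 8 can be observed numerically. The sketch below rests on several assumptions that are not in the text: the $\mathcal{L}^2$ ortho-projection is replaced by a least-squares fit on a fine midpoint grid, the B-splines are evaluated by the standard recurrence, the test function $h(x) = |x - 0.3|^{0.75}$ is an arbitrary choice, and the helper names `bspline`, `spline_basis` and `projection_error` are hypothetical.

```python
import numpy as np

def bspline(r, x):
    """Cardinal B-spline N^(r) via the standard recurrence (an assumption)."""
    x = np.asarray(x, dtype=float)
    if r == 1:
        return ((x >= 0) & (x < 1)).astype(float)
    return (x * bspline(r - 1, x) + (r - x) * bspline(r - 1, x - 1)) / (r - 1)

def spline_basis(r, j, x):
    """Columns N^(r)(2^j x - l), l = -r+1, ..., 2^j - 1, spanning S_j(r) on [0, 1]."""
    ls = np.arange(-r + 1, 2 ** j)
    return bspline(r, (2.0 ** j) * x[:, None] - ls[None, :])

def projection_error(h_vals, r, j, x):
    """Discrete L2 distance between h and its least-squares fit from S_j(r)."""
    basis = spline_basis(r, j, x)
    coef, *_ = np.linalg.lstsq(basis, h_vals, rcond=None)
    resid = h_vals - basis @ coef
    return np.sqrt(np.mean(resid ** 2))

n_grid = 4096
x = (np.arange(n_grid) + 0.5) / n_grid      # midpoint grid on [0, 1]
h = np.abs(x - 0.3) ** 0.75                 # illustrative test function only
levels = list(range(1, 7))
errors = [projection_error(h, r=2, j=j, x=x) for j in levels]
for j, err in zip(levels, errors):
    print(j, err)
```

Because the spaces $S_j(r)$ are nested in $j$, the printed errors cannot increase with the resolution level; the speed at which they shrink reflects the smoothness of $h$, in line with Part b.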

Proposition 9 Suppose $r \in \mathbb{N}$. Let $h \in \mathcal{B}_s$ for some $s \in (0, r - 1/2)$. Then

$$\|\pi_j^{(r)}(h)\|_{s,2} \le c''_s \|h\|_{s,2},$$

for every $j \in \mathbb{N}$, where the (positive) finite constant $c''_s$ depends only on $s$. Furthermore, for every $s' \in (0, s]$

$$\|\pi_j^{(r)}(h) - h\|_{s',2} \le 2^{-j(s-s')} c'''_{s,s'} \|h\|_{s,2}$$

for every $j \in \mathbb{N}$, where the (positive) finite constant $c'''_{s,s'}$ depends only on $s$ and $s'$.

Proof. By Theorem 12.3.3 in DeVore and Lorentz (1993) (with $p = 2$, $\lambda = r - 1/2$, $q = \infty$, $\alpha = s$, and $d_{n,r}(\cdot)_2$ defined on p. 358 of that reference) we have

\\4^{h)h,2 = \\4\h)\\2+ sup |^|-^||A;(7r«(/i))||2 

< ||/i||2 + e,sup2"«d„,,(7r^.''^(/i))2 

n>0 

< ||/.||2 + e,sup2-||7rW(7r5'-)(ft))-7r;.'-)(/.)||2 

n>0 

< \\h\\2 + es sup 2"«||7r(:)(/i) - 7T'[\h)\\2 < \\h\U,2 + 2esc'Mk2 

0<n<j 






for some universal constant $e_s$, where we have used Proposition [8] in the last step. To prove the second claim we argue as before and then use Proposition [8] to obtain

ht^{h) - /i||.',2 < h\''\h) -hh + e, sup2"^'|KW(7r('-)(/i) -K)- {ixf(h) - h)h 



n>0 



< ||7rf)(;^)-/i||2 + ( 

< 2-^'^4||;i||,,2 + e.- 



(/i) - + sup2"«'|KW(/i) - 



n>j 



2-'"(^'-^)4||;i||,,2 + sup2"(^'-^)4||/i||, 



< 2--''(^-'^')(l + 2e,0c'JI'^ll. 



B Appendix: Consistency of the Indirect Inference Estimator and Measurability Issues

Proof of Proposition [TJ Because of continuity of the B-spline basis functions for r > 2 
and continuity of — ^ for every u e V, the map 9 — )■ Pk,j,r{()){y) is continuous for every 

y S [0, 1]. Furthermore, Pnj.r, and Pk,j.r{(^) are bounded on [0, 1], the latter one uniformly in 0, in 
view of the discussion surrounding (jl3p . Next note that the set An appearing in the definition of 
Qn,k coincides with the event {inf j,g[o,i] Pn,j,r, (y) > O}, since Pn.,j,r, is continuous on [0, 1] in case 
> 1, and is piecewise constant in case = 1. Hence, by the dominated convergence theorem, 
Qn,k is continuous (and real- valued) on 9 ii Pn,j,r,{y) > for every y G [0,1]; and the same 
conclusion trivially holds in the other case. As mentioned before, Qn,k{0) : [0, 1]°° x V°° -> R is 
^[Ji] '8'2J°°-measurable for every 6* € 8. Since 9 is compact, existence of a measurable minimizer 
then follows, e.g., from Lemma A3 in Potscher and Prucha (1997). ■ 

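Proposition 10 Suppose $\Theta$ is compact in $\mathbb{R}^b$, that the map $\theta \mapsto p(\theta, x)$ is continuous on $\Theta$ for every $x \in [0,1]$, and that $\sup_{\theta \in \Theta} \|p(\theta)\|_{\infty} < \infty$. Furthermore, assume that $r_* > 1$ holds. Then there exists a $\mathfrak{B}_{[0,1]}^{\infty} \otimes \mathfrak{V}^{\infty}$-measurable $\hat\theta_n$ that minimizes $Q_n(\theta)$ over $\Theta$. (In fact, $\hat\theta_n$ is $\mathfrak{B}_{[0,1]}^{\infty}$-measurable as it does not depend on the simulations.)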

Proof. Since $\|p_{n,j,r_*}\|_{\infty} < \infty$ and since on the event $A_n$ also $\inf_{y \in [0,1]} p_{n,j,r_*}(y) > 0$ holds, the assumptions on $p(\theta)$ and the dominated convergence theorem imply that $Q_n$ is real-valued and continuous in $\theta$ on the event $A_n$; and the same conclusion trivially holds on the complement of $A_n$. Furthermore, $\mathfrak{B}_{[0,1]}^{\infty} \otimes \mathfrak{V}^{\infty}$-measurability of $Q_n(\theta) : [0,1]^{\infty} \times V^{\infty} \to \mathbb{R}$ for every $\theta \in \Theta$ follows from Tonelli's Theorem since $p_{n,j,r_*}$ is jointly measurable (and $A_n$ is measurable). Since $\Theta$ is compact, existence of a measurable minimizer then follows, e.g., from Lemma A3 in Pötscher and Prucha (1997). ■

Proposition 11 Suppose Assumptions P1(i),(ii) are satisfied and $r_* \ge 2$ holds. If $j_n \to \infty$ as
n ^ CO in such a way that for some 5 > 1/2 we have sup„>]^ 2''"^^'^+^-'/n < 00 then 

On ^ ^0 *^ Vv -probability as n ^ 00, 

where $\hat\theta_n$ has been defined in Section 5.2.

The proof of this result is completely analogous to the proof of Proposition [5] and is thus 
omitted. 
Remark 5 (Measurability issues) (i) For every $j \ge 1$, $r \ge 1$, and $\theta \in \Theta$, the expressions $\|p_{k,j,r}(\theta)\|_2$, $\|p_{k,j,r}(\theta)\|_{\infty}$, and $\|p_{k,j,r}(\theta)\|_{s,2}$ (for $s < r - 1/2$) are measurable functions of $v_1, \ldots, v_k$, since the coefficients $\gamma_{lj}^{(r)}(\theta)$ are measurable. This is obvious for the $\mathcal{L}^2$-norm, but holds in general for the following reason: observe that any one of the norms mentioned, when restricted to $S_j(r)$, is a continuous function of the coefficients $\gamma_{lj}^{(r)}(\theta)$ because $S_j(r)$ is finite-dimensional. The same is true if $p_{k,j,r}(\theta)$ is replaced by $p_{k,j,r}(\theta) - E p_{k,j,r}(\theta)$ or $p_{k,j,r}(\theta) - p(\theta)$, in the latter case provided the respective norm of $p(\theta)$ is finite. [The argument is the same, except that $S_j(r)$ is to be replaced by the linear span of $S_j(r) \cup \{p(\theta)\}$ for establishing the latter claim.] Analogous statements obviously also hold for $p_{n,j,r_*}$ for every $j \ge 1$, $r_* \ge 1$. (ii) The reasoning just given in fact establishes that the above mentioned norms of $p_{k,j,r}(\theta)$ and $p_{k,j,r}(\theta) - E p_{k,j,r}(\theta)$ are continuous functions of $\theta$, provided the coefficients $\gamma_{lj}^{(r)}(\theta)$ ($E \gamma_{lj}^{(r)}(\theta)$) are continuous in $\theta$ (which is, e.g., the case if $r \ge 2$ and Assumption R(i) holds); consequently, suprema over $\theta$ of the above mentioned norms of $p_{k,j,r}(\theta)$ and $p_{k,j,r}(\theta) - E p_{k,j,r}(\theta)$ are then measurable. [We note that this argument does not apply to suprema of norms of $p_{k,j,r}(\theta) - p(\theta)$, because $p(\theta)$ may not vary in a finite-dimensional space when $\theta$ varies.]
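The finite-dimensionality argument in part (i) of the remark can be made explicit with a one-line estimate. The notation here is illustrative and not taken from the paper: $\varphi_1, \ldots, \varphi_m$ denotes a basis of $S_j(r)$, $a_l$ and $b_l$ are two sets of real coefficients, and $\|\cdot\|$ is any of the norms in question, assumed finite on each $\varphi_l$. Then, by the reverse triangle inequality,

\[
\Big| \Big\| \sum_{l=1}^{m} a_l \varphi_l \Big\| - \Big\| \sum_{l=1}^{m} b_l \varphi_l \Big\| \Big|
\le \Big\| \sum_{l=1}^{m} (a_l - b_l) \varphi_l \Big\|
\le \max_{1 \le l \le m} \|\varphi_l\| \sum_{l=1}^{m} |a_l - b_l| .
\]

Hence the norm is a Lipschitz, and in particular Borel measurable, function of the coefficient vector, so that its composition with measurable (respectively, continuous) coefficients is again measurable (respectively, continuous).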



C Appendix: Moment Bounds for Empirical Processes 



The following moment inequalities can be deduced from a general theorem in Giné and Koltchinskii (2006) and a refinement with explicit constants in Giné and Nickl (2009a).

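Proposition 12 Let $Z_i$, $i \in \mathbb{N}$, be i.i.d. random variables with values in a measurable space $(S, \mathcal{A})$ and common law $R$. Let $\mathcal{F}$ be a countable $R$-centered class of real-valued measurable functions from $(S, \mathcal{A})$ to $\mathbb{R}$. Assume that $\mathcal{F}$ is uniformly bounded by a finite positive constant $U$ and let further $\sigma$, $0 < \sigma \le U$, be some constant satisfying $\sup_{f \in \mathcal{F}} E f^2(Z_1) \le \sigma^2$.

a. Assume that the $\mathcal{L}^2(Q)$-covering numbers satisfy

\[
\sup_{Q} \log N(\mathcal{F}, \mathcal{L}^2(Q), \tau) \le v \log\Big(\frac{AU}{\tau}\Big), \qquad 0 < \tau \le 2U,
\]

for some $A > e$ and $v \ge 2$ (the supremum extending over all probability measures $Q$ on $S$). Then, for every $b_0 > 0$ satisfying

\[
n\sigma^2 \ge b_0 v U^2 \log(5AU/\sigma) \quad \text{for all } n \in \mathbb{N}, \tag{57}
\]

there exists a finite positive constant $b_1(v, b_0)$, that depends only on $v$ and $b_0$, such that for every $n \in \mathbb{N}$

\[
E \Big\| \sum_{i=1}^{n} f(Z_i) \Big\|_{\mathcal{F}}^2 \le b_1(v, b_0)\, n\sigma^2 \log\Big(\frac{AU}{\sigma}\Big)
\]

holds.

b. Assume that the $\mathcal{L}^2(Q)$-covering numbers satisfy

\[
\sup_{Q} \log N(\mathcal{F}, \mathcal{L}^2(Q), \tau) \le \Big(\frac{A'U}{\tau}\Big)^{w}, \qquad 0 < \tau \le 2U,
\]

for some $0 < A' < \infty$ and $0 < w < 2$. Then, for all $n \in \mathbb{N}$ and some positive constant $b_2$, that depends only on $A'$, $w$, we have

\[
E \Big\| \sum_{i=1}^{n} f(Z_i) \Big\|_{\mathcal{F}} \le b_2 \sqrt{n}\, U.
\]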
Proof. Since the results depend only on the distribution of $\|\sum_{i=1}^{n} f(Z_i)\|_{\mathcal{F}}$, we may assume w.l.o.g. that, as in Giné and Koltchinskii (2006), the random variables are realized as coordinate projections on the infinite product space of $(S, \mathcal{A})$. The second claim of the proposition then follows directly from Theorem 3.1 in Giné and Koltchinskii (2006) applied to the class $\mathcal{F}' = \{f/U : f \in \mathcal{F}\}$ with envelope $F = 1$ and $H(x) = (A'x)^{w}$ for $x \ge 1/2$ and $H(x) = 0$ for $0 < x < 1/2$. The first claim is proved as follows: By Proposition 3.1 in Giné, Latała and Zinn (2000) (applied to $\mathcal{F} \cup (-\mathcal{F})$ and observing that $\sigma^2$ in that reference is bounded by $n\sigma^2$ in our notation) we have

\[
E \Big\| \sum_{i=1}^{n} f(Z_i) \Big\|_{\mathcal{F}}^2 \le K \Big[ \Big( E \Big\| \sum_{i=1}^{n} f(Z_i) \Big\|_{\mathcal{F}} \Big)^2 + 2n\sigma^2 + 4U^2 \Big],
\]

where $K$ is a universal constant. We then bound the first term on the right-hand side by using Proposition 3 in Giné and Nickl (2009a) and simplify the resulting bound using (57), $A > e$, and $U/\sigma \ge 1$ to arrive at the result. ■
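The rescaling behind the second claim can be checked directly. The following is a sketch under assumptions: the exponent in part b is written $w$, and the entropy condition of Theorem 3.1 in Giné and Koltchinskii (2006) is taken to have the form $\log N(\mathcal{F}', \mathcal{L}^2(Q), \tau \|F\|_{\mathcal{L}^2(Q)}) \le H(1/\tau)$, which should be verified against that reference. Since dividing every function by $U$ scales all $\mathcal{L}^2(Q)$-distances by $1/U$,

\[
N(\mathcal{F}', \mathcal{L}^2(Q), \tau) = N(\mathcal{F}, \mathcal{L}^2(Q), U\tau),
\]

and therefore, for $0 < \tau \le 2$ (that is, $1/\tau \ge 1/2$),

\[
\log N(\mathcal{F}', \mathcal{L}^2(Q), \tau) \le \Big( \frac{A'U}{U\tau} \Big)^{w} = \Big( \frac{A'}{\tau} \Big)^{w} = H(1/\tau).
\]

For $\tau > 2$ the class $\mathcal{F}'$, being uniformly bounded by $1$, has $\mathcal{L}^2(Q)$-diameter at most $2 < \tau$, so a single ball of radius $\tau$ covers it and $\log N(\mathcal{F}', \mathcal{L}^2(Q), \tau) = 0 = H(1/\tau)$. This is consistent with the choice $H(x) = 0$ for $0 < x < 1/2$ made in the proof.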